Tags : Browse Projects

scikit-learn

  Analyzed about 9 hours ago

scikit-learn is a Python module integrating various machine learning algorithms under a common interface. It offers a wide range of methods such as Support Vector Machines, linear models (L1, L2 penalized), logistic regression, Gaussian mixture models and more. The large number of algorithms already implemented allows for easy comparison of the accuracy and performance of various algorithms.

158K lines of code

366 current contributors

1 day since last commit

74 users on Open Hub

Very High Activity
Rating: 5.0

Apache Spark

Claimed by Apache Software Foundation
Analyzed about 11 hours ago

Apache Spark is an open source cluster computing system that aims to make data analytics fast: both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly more rapidly than with disk-based systems like Hadoop. To make programming faster, Spark offers high-level APIs in Scala, Java and Python, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala or Python shells. Spark integrates closely with Hadoop to run inside Hadoop clusters and can access any existing Hadoop data source.

840K lines of code

368 current contributors

1 day since last commit

56 users on Open Hub

Very High Activity
Rating: 5.0

Natural Language Toolkit (NLTK)

  Analyzed about 4 hours ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing. It supports dozens of NLP tasks and has distributions for Windows, Mac OS X and Linux.

228K lines of code

42 current contributors

16 days since last commit

45 users on Open Hub

High Activity
Rating: 5.0

WEKA

  Analyzed 27 days ago

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

753K lines of code

3 current contributors

about 1 month since last commit

38 users on Open Hub

Moderate Activity
Rating: 3.93
Licenses: No declared licenses

Apache Mahout

Claimed by Apache Software Foundation
Analyzed 27 days ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to give good performance for non-distributed algorithms as well.

144K lines of code

0 current contributors

over 1 year since last commit

24 users on Open Hub

Very Low Activity
Rating: 3.6

YALE -- Open-Source Java Data Mining

  Analyzed 27 days ago

YALE (Yet Another Learning Environment) is the most comprehensive open-source software for intelligent data analysis, data mining, knowledge discovery, machine learning, predictive analytics, forecasting, and analytics in business intelligence (BI). YALE provides more than 400 data mining operators, a graphical user interface (GUI), an online tutorial with hands-on data mining applications, a comprehensive PDF tutorial, many visualization schemes for data sets and data mining results, and many different learning and meta-learning schemes ranging from decision tree and rule learners to neural networks, SVMs, ensemble methods, etc. YALE is implemented in Java and available under the GPL (GNU General Public License) as well as under a developer license (OEM license) for closed-source developers.

749K lines of code

0 current contributors

almost 5 years since last commit

17 users on Open Hub

Inactive
Rating: 4.25
Licenses: No declared licenses

TensorFlow

  Analyzed 27 days ago

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

2.49M lines of code

875 current contributors

about 2 months since last commit

15 users on Open Hub

Very High Activity
Rating: 5.0
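
The data flow graph model mentioned in the TensorFlow description (nodes as operations, edges as the arrays passed between them) can be illustrated with a small plain-Python sketch. This is not the TensorFlow API: `Node`, `constant`, `evaluate`, `add`, and `scale` are hypothetical names made up for this example, and a flat list of floats stands in for a tensor.

```python
# Toy data flow graph in plain Python. This is NOT the TensorFlow API;
# Node, constant, evaluate, add, and scale are hypothetical illustration names.
from dataclasses import dataclass
from typing import Callable, List, Sequence

Tensor = List[float]  # stand-in for a multidimensional array


@dataclass
class Node:
    """A graph node: one operation plus the nodes whose outputs feed it."""
    op: Callable[..., Tensor]
    inputs: Sequence["Node"] = ()


def constant(values: Tensor) -> Node:
    # A source node with no incoming edges; it always emits a copy of `values`.
    return Node(op=lambda: list(values))


def evaluate(node: Node) -> Tensor:
    # Evaluate upstream nodes first, then pass their outputs along the edges.
    args = [evaluate(parent) for parent in node.inputs]
    return node.op(*args)


def add(a: Tensor, b: Tensor) -> Tensor:
    # Element-wise sum of two equally sized arrays.
    return [x + y for x, y in zip(a, b)]


def scale(a: Tensor) -> Tensor:
    # Multiply every element by two.
    return [2.0 * x for x in a]


# Build the graph first; nothing is computed until evaluate() is called.
x = constant([1.0, 2.0])
y = constant([3.0, 4.0])
total = Node(op=add, inputs=(x, y))
doubled = Node(op=scale, inputs=(total,))

result = evaluate(doubled)
print(result)
```

Separating graph construction from evaluation is the point of the sketch: the same graph description could, in principle, be handed to different executors, which is the idea behind deploying one computation to different devices.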

Apache OpenNLP

Claimed by Apache Software Foundation
Analyzed 27 days ago

Apache OpenNLP is a Java machine learning toolkit for natural language processing (NLP).

141K lines of code

10 current contributors

5 months since last commit

12 users on Open Hub

Very Low Activity
Rating: 5.0

gCube

  Analyzed 26 days ago

gCube is a framework dedicated to scientists. It enables the declarative and interactive creation of transient Virtual Research Environments that aggregate and deploy on-demand content resources and application services by exploiting computational and storage resources offered by private and commercial cloud providers.

107M lines of code

15 current contributors

27 days since last commit

11 users on Open Hub

Very High Activity
Rating: 5.0

dlib C++ Library

  Analyzed about 5 hours ago

This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as interfacing with network services, handling threads, or creating graphical user interfaces. Additionally, the library implements many useful algorithms such as data compression routines, linked lists, binary search trees, linear algebra and matrix utilities, machine learning algorithms, XML and text parsing, and many other general utilities.

415K lines of code

24 current contributors

1 day since last commit

11 users on Open Hub

Moderate Activity
Rating: 4.75