Tags : Browse Projects

scikit-learn

  Analyzed 18 days ago

Python module integrating various machine learning algorithms under a common interface. It offers a wide range of methods such as Support Vector Machines, linear models (L1, L2 penalized), logistic regression, Gaussian mixture models and more. The large number of algorithms already implemented allows for easy comparison of the accuracy and performance of various algorithms.

191K lines of code

334 current contributors

18 days since last commit

63 users on Open Hub

Very High Activity
5.0
 
Apache Spark

Claimed by Apache Software Foundation Analyzed 6 days ago

Apache Spark is an open source cluster computing system that aims to make data analytics fast: both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly more rapidly than with disk-based systems like Hadoop. To make programming faster, Spark offers high-level APIs in Scala, Java and Python, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala or Python shells. Spark integrates closely with Hadoop to run inside Hadoop clusters and can access any existing Hadoop data source.

1.13M lines of code

442 current contributors

6 days since last commit

47 users on Open Hub

Very High Activity
5.0
 
Natural Language Toolkit (NLTK)

  Analyzed 11 days ago

NLTK — the Natural Language Toolkit — is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OSX and Linux.

211K lines of code

47 current contributors

29 days since last commit

45 users on Open Hub

High Activity
5.0
 
Apache Mahout

Claimed by Apache Software Foundation Analyzed 5 months ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance for non-distributed algorithms as well.

132K lines of code

8 current contributors

5 months since last commit

25 users on Open Hub

Activity Not Available
3.6
   
YALE -- Open-Source Java Data Mining

  Analyzed over 1 year ago

YALE (Yet Another Learning Environment) is the most comprehensive open-source software for intelligent data analysis, data mining, knowledge discovery, machine learning, predictive analytics, forecasting, and analytics in business intelligence (BI). YALE provides more than 400 data mining operators, a graphical user interface (GUI), an online tutorial with hands-on data mining applications, a comprehensive PDF tutorial, many visualization schemes for data sets and data mining results, and many different learning and meta-learning schemes ranging from decision tree and rule learners to neural networks, SVMs, ensemble methods, etc. YALE is implemented in Java and available under the GPL (GNU General Public License) as well as under a developer license (OEM license) for closed-source developers.

3.53M lines of code

3 current contributors

over 2 years since last commit

17 users on Open Hub

Activity Not Available
4.25
   
Licenses: No declared licenses

Apache OpenNLP

Claimed by Apache Software Foundation Analyzed 8 months ago

Apache OpenNLP is a Java machine learning toolkit for natural language processing (NLP).

460K lines of code

6 current contributors

9 months since last commit

12 users on Open Hub

Activity Not Available
5.0
 
gCube

  Analyzed 17 days ago

gCube is a framework dedicated to scientists. It enables the declarative and interactive creation of transient Virtual Research Environments that aggregate and deploy on-demand content resources and application services by exploiting computational and storage resources offered by private and commercial cloud providers.

22.7M lines of code

3 current contributors

about 1 month since last commit

11 users on Open Hub

New Project
5.0
 
dlib C++ Library

  Analyzed 8 months ago

This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as interfacing with network services, handling threads, or creating graphical user interfaces. Additionally, the library implements many useful algorithms such as data compression routines, linked lists, binary search trees, linear algebra and matrix utilities, machine learning algorithms, XML and text parsing, and many other general utilities.

357K lines of code

23 current contributors

9 months since last commit

10 users on Open Hub

Activity Not Available
4.75
   
OpenCog Framework

  Analyzed 18 days ago

The Open Cognition Framework (OpenCog) is software for the collaborative development of safe and beneficial Artificial General Intelligence. OpenCog provides research scientists and software developers with a common platform to build and share artificial intelligence programs. Programs written or adapted for OpenCog may be combined and used in concert with one another for experimentation or to achieve better results compared to their stand-alone counterparts. OpenCog is under active development, but doesn't yet have an official release. It is currently best suited for machine learning developers, but there is interest in making it more accessible to newcomers.

124K lines of code

23 current contributors

18 days since last commit

9 users on Open Hub

Very High Activity
5.0
 
PyMVPA

  Analyzed 5 months ago

Python module to ease pattern classification analyses of large datasets. It provides high-level abstraction of typical processing steps (e.g. data preparation, classification, feature selection, generalization testing), a number of implementations of some popular algorithms (e.g. kNN, Ridge Regressions, Sparse Multinomial Logistic Regression, GPR, RFE, I-RELIEF), and bindings to external ML libraries (libsvm, shogun, R). While it is not limited to neuroimaging data (e.g. fMRI), it is eminently suited for such datasets.

139K lines of code

6 current contributors

about 1 year since last commit

8 users on Open Hub

Activity Not Available
5.0
 