Projects tagged ‘machine_learning’

scikit-learn

Analyzed 1 day ago

scikit-learn is a Python module integrating various machine learning algorithms under a common interface. It offers a wide range of methods such as Support Vector Machines, linear models (L1, L2 penalized), logistic regression, Gaussian mixture models and more. The large number of algorithms already ...

8.67M lines of code

398 current contributors

1 day since last commit

80 users on Open Hub

Very High Activity

0 Reviews

Mostly written in Python

Licenses: BSD-3-Clause

Apache Spark

Claimed by Apache Software Foundation Analyzed about 17 hours ago

Apache Spark is an open source cluster computing system that aims to make data analytics fast, both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly more rapidly than with ...

1.3M lines of code

374 current contributors

about 23 hours since last commit

56 users on Open Hub

Very High Activity

0 Reviews

Mostly written in Scala

Licenses: apache_2

Tags apache bigdata cluster clustercomputing distributed distributed_computing ec2 graph_computing hadoop hdfs in_memory java 8 more...

Natural Language Toolkit (NLTK)

Analyzed about 23 hours ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.

234K lines of code

42 current contributors

22 days since last commit

45 users on Open Hub

Low Activity

0 Reviews

Mostly written in Python

Licenses: apache_2

Tags artificial_intelligence classifier classifiers computational_linguistics corpora corpus education first_order_logic grammar information_retrieval linguistics machine_learning 19 more...

WEKA

Analyzed about 6 hours ago

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is ...

780K lines of code

3 current contributors

over 1 year since last commit

38 users on Open Hub

Very Low Activity

0 Reviews

Mostly written in Java

Licenses: No declared licenses

Tags algorithms analysis api artificial_intelligence association_mining association_rules business_intelligence classifiers clustering data data_analysis data_mining 13 more...

Apache Mahout

Claimed by Apache Software Foundation Analyzed 1 day ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean: scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. ...

146K lines of code

0 current contributors

2 months since last commit

25 users on Open Hub

Low Activity

0 Reviews

Mostly written in Java

Licenses: apache_2

Tags algorithms classifiers clustering collaborative_filtering data_mining datamining dimension_reduction distributed distributed_computing hadoop java library 5 more...

TensorFlow

Analyzed about 5 hours ago

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy ...

3.84M lines of code

798 current contributors

about 10 hours since last commit

23 users on Open Hub

Very High Activity

0 Reviews

Mostly written in C++

Licenses: apache_2

Tags c++ deep_learning deep_neural_networks flow_graphs GPU gpu_computing machine_learning mobile numerical_computing python

YALE Open-Source Java Data Mining

Analyzed about 19 hours ago

YALE (Yet Another Learning Environment) is the most comprehensive open-source software for intelligent data analysis, data mining, knowledge discovery, machine learning, predictive analytics, forecasting, and analytics in business intelligence (BI). YALE provides more than 400 data mining operators ...

751K lines of code

0 current contributors

over 9 years since last commit

17 users on Open Hub

Inactive

0 Reviews

Mostly written in Java

Licenses: No declared licenses

Tags analysis data data_analysis data_mining development education framework intelligent_data_analysis java java_data_mining kdd knowledge_discovery 5 more...

gCube

Analyzed about 15 hours ago

gCube is a software system specifically designed and developed to enact the building and operation of large-scale infrastructures providing their users with a rich array of services suitable for supporting the co-creation of Virtual Research Environments and promoting the implementation of open ...

1.49M lines of code

15 current contributors

3 days since last commit

14 users on Open Hub

High Activity

0 Reviews

Mostly written in Java

Licenses: EUPL

Tags algorithms analysis batch_processing biodiversity_informatics data_access data_analysis data_cleansing data_infrastructure data_mining data_processing distributed_computing distributed_storage 8 more...

Apache OpenNLP

Claimed by Apache Software Foundation Analyzed 1 day ago

Apache OpenNLP is a Java machine learning toolkit for natural language processing (NLP).

157K lines of code

8 current contributors

2 days since last commit

12 users on Open Hub

Moderate Activity

0 Reviews

Mostly written in Java

Licenses: apache_2

Tags analysis apache chunker classifier computational_linguistics coreferenceresolution java machine_learning maxent natural_language_processing ner nlp 8 more...

dlib C++ Library

Analyzed about 15 hours ago

This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as ...

450K lines of code

25 current contributors

15 days since last commit

11 users on Open Hub

Moderate Activity

1 Review

Mostly written in C++

Licenses: Boost_Sof...

Tags algorithms api bayesnet c++ classifiers command_line compression cplusplus cross-platform framework gui library 28 more...
