Tags : Browse Projects

Apache Mahout

Claimed by Apache Software Foundation
Analyzed about 22 hours ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification, and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized so that non-distributed algorithms also perform well.

146K lines of code

0 current contributors

2 months since last commit

25 users on Open Hub

Low Activity
3.6
   

SpagoBI

Claimed by The OW2 Consortium
No analysis available

SpagoBI is an integration platform focused on business intelligence needs at the enterprise level. It is a fully open source solution with no separate professional edition. SpagoBI offers a complete analytical layer: reporting, OLAP, data mining, dashboards, free and visual data inquiry, and GIS. It is built on a flexible SOA architecture and a portal approach that allows users to customize their own workspace.

0 lines of code

0 current contributors

0 since last commit

9 users on Open Hub

Activity Not Available
5.0
 
Mostly written in language not available
Licenses: mozilla_p...

lava-server

Claimed by Linaro
No analysis available

Linaro Automated Validation Architecture server, including lava-scheduler and lava-dashboard.

0 lines of code

32 current contributors

0 since last commit

4 users on Open Hub

Activity Not Available
5.0
 
Mostly written in language not available
Licenses: AGPL3

Crab - Scikit-Recommender

  Analyzed about 16 hours ago

Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms into the world of scientific Python packages (NumPy, SciPy, Matplotlib). The engine aims to provide a rich set of components from which you can construct a customized recommender system from a set of algorithms. It is designed for scalability, flexibility, and performance, making use of optimized scientific Python packages to provide simple and efficient solutions that are accessible to everybody and reusable in various contexts: science and engineering.

4.21K lines of code

0 current contributors

about 12 years since last commit

2 users on Open Hub

Inactive
5.0
 

SOFA Statistics

  Analyzed 1 day ago

SOFA is a statistics, analysis, and reporting program with an emphasis on ease of use, learn as you go, and beautiful output.

37K lines of code

0 current contributors

3 months since last commit

1 user on Open Hub

Very Low Activity
0.0
 
Licenses: No declared licenses

Sally Tool

  Analyzed about 23 hours ago

Sally is a small tool for mapping a set of strings to a set of vectors. This mapping is referred to as embedding and allows techniques from machine learning and data mining to be applied to the analysis of string data. Sally implements a standard technique for mapping strings to a vector space that is often referred to as the vector space model or bag-of-words model. The strings are characterized by a set of features, where each feature is associated with one dimension of the vector space. Sally proceeds by counting the occurrences of the specified features in each string and generating a sparse vector of count values. The tool then normalizes the vectors and outputs them in a given format.
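The embedding described above can be illustrated with a minimal Python sketch. This is not Sally's code or command-line interface: the function name `embed`, the choice of character n-grams as features, and the use of L2 normalization are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text, n=2):
    """Map a string to a sparse, normalized vector of feature counts.

    Assumption: features are character n-grams, and each distinct
    n-gram corresponds to one dimension of the vector space.
    """
    # Count how often each n-gram occurs in the string.
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    # L2 normalization (an assumed choice of norm).
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}  # string shorter than n: no features, empty vector
    # A dict keeps only the non-zero dimensions, i.e. a sparse vector.
    return {feature: c / norm for feature, c in counts.items()}


vec = embed("abab")
print(vec)
```

Storing the vector as a dictionary keeps only the features that actually occur, which matters because the full feature space (all possible n-grams) is far larger than what any single string contains.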

5.62K lines of code

1 current contributor

almost 5 years since last commit

1 user on Open Hub

Inactive
0.0
 

jMotif

  Analyzed about 20 hours ago

JMotif implements in Java a number of methods for timeseries data handling and analysis:
* Z normalization of timeseries
* Piecewise Aggregate Approximation (PAA) of timeseries
* Symbolic Aggregate Approximation (SAX) of timeseries
* iSAX (indexed SAX)

To help one leverage the symbolic representation of timeseries, it implements:
* TFIDF statistics
* Cosine similarity
* Sequitur algorithm

as well as their application for:
* Motif (recurring patterns) detection with SAX
* Discord (unique patterns) detection with SAX
* Timeseries classification
* Timeseries clustering
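For readers unfamiliar with the first three methods named in the jMotif description, here is a minimal Python sketch of Z normalization, PAA, and SAX chained together. It is not jMotif's Java API: the function names, the population standard deviation, the `eps` threshold for flat series, the requirement that the series length be divisible by the segment count, and the four-letter alphabet are all assumptions chosen to keep the example short.

```python
import math
from statistics import NormalDist


def znorm(series, eps=1e-8):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std < eps:
        # A flat series has no shape to normalize; return zeros.
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment.

    Assumption: len(series) is divisible by segments.
    """
    size = len(series) // segments
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(segments)]


def sax(series, segments, alphabet="abcd"):
    """Symbolic Aggregate Approximation: map each PAA value to a letter."""
    k = len(alphabet)
    # Breakpoints split the standard normal distribution into k equiprobable regions.
    cuts = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    word = ""
    for value in paa(znorm(series), segments):
        # The letter index is the number of breakpoints below the value.
        word += alphabet[sum(value > c for c in cuts)]
    return word


word = sax([1, 2, 3, 4, 5, 6, 7, 8], segments=4)
print(word)
```

The resulting word is a short string standing in for the series, which is what makes text-oriented statistics such as TFIDF and cosine similarity applicable to timeseries.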

4.3K lines of code

0 current contributors

over 2 years since last commit

1 user on Open Hub

Inactive
0.0
 

refine-client-py

  Analyzed about 23 hours ago

The Google Refine Python Client Library provides an interface for communicating with a Google Refine server.

1.41K lines of code

0 current contributors

over 9 years since last commit

1 user on Open Hub

Inactive
0.0
 
Licenses: No declared licenses

Knowing Datamining

  Analyzed about 8 hours ago

Data mining framework based on WEKA and Akka.

8.58K lines of code

0 current contributors

over 9 years since last commit

1 user on Open Hub

Inactive
0.0
 

ELKI

  Analyzed about 7 hours ago

ELKI ("Environment for Developing KDD-Applications Supported by Index-Structures") is a development framework for data mining algorithms written in Java. It includes a large variety of popular data mining algorithms, distance functions, and index structures. Its focus is particularly on clustering and outlier detection methods, in contrast to many other data mining toolkits that focus on classification. Additionally, it supports index structures such as the R*-Tree and M-Tree to improve algorithm performance. The modular architecture is meant to allow adding custom components such as distance functions or algorithms, while being able to reuse the other parts for evaluation.

214K lines of code

2 current contributors

3 days since last commit

1 user on Open Hub

Moderate Activity
5.0
 