Tags : Browse Projects

Apache Mahout

Claimed by Apache Software Foundation

Analyzed 18 days ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized so that non-distributed algorithms also perform well.

182K lines of code

14 current contributors

26 days since last commit

24 users on Open Hub

Moderate Activity
3.6
   
SpagoBI

Claimed by The OW2 Consortium

Analyzed over 1 year ago

SpagoBI is an integration platform focused on business intelligence needs at the enterprise level. It is a fully open source solution with no separate professional edition. SpagoBI offers a complete analytical layer: reporting, OLAP, data mining, dashboards, free and visual data inquiry, and GIS. It is built on a flexible SOA architecture and a portal approach that allows users to customize their own workspaces.

151M lines of code

16 current contributors

over 1 year since last commit

8 users on Open Hub

Activity Not Available
5.0
 
CVSAnalY

  Analyzed about 1 year ago

CVSAnalY is a tool that extracts information from source code repository logs and stores it in a database.

5.04K lines of code

0 current contributors

over 5 years since last commit

5 users on Open Hub

Activity Not Available
4.5
   
lava-server

Claimed by Linaro

Analyzed 24 days ago

Linaro Automated Validation Architecture server, including lava-scheduler and lava-dashboard.

65.6K lines of code

27 current contributors

28 days since last commit

4 users on Open Hub

High Activity
5.0
 
Crab - Scikit-Recommender

  Analyzed 16 days ago

Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms into the world of scientific Python packages (NumPy, SciPy, Matplotlib). The engine aims to provide a rich set of components from which you can construct a customized recommender system from a set of algorithms. It is designed for scalability, flexibility and performance, making use of optimized scientific Python packages in order to provide simple and efficient solutions that are accessible to everybody and reusable in various contexts: science and engineering.

4.21K lines of code

0 current contributors

almost 6 years since last commit

2 users on Open Hub

Inactive
5.0
 
WebHarvest - web data extraction tool

  Analyzed over 1 year ago

Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.

22.7K lines of code

1 current contributor

almost 8 years since last commit

2 users on Open Hub

Activity Not Available
0.0
 
Licenses: No declared licenses

Sally Tool

  Analyzed 17 days ago

Sally is a small tool for mapping a set of strings to a set of vectors. This mapping is referred to as embedding and allows for applying techniques of machine learning and data mining for analysis of string data. Sally implements a standard technique for mapping strings to a vector space that is often referred to as vector space model or bag-of-words model. The strings are characterized by a set of features, where each feature is associated with one dimension of the vector space. Sally proceeds by counting the occurrences of the specified features in each string and generating a sparse vector of count values. The tool then normalizes the vectors and outputs them in a given format.
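
The counting and normalizing steps described above can be illustrated with a minimal Python sketch. This is not Sally's own code or command-line interface: the function name `embed`, the choice of word n-grams as features, and the L2 normalization are assumptions made for illustration, since the description only says that features are counted and the vectors are normalized.

```python
import math
from collections import Counter


def embed(text, n=1, delimiter=None):
    """Map a string to a sparse, L2-normalized vector of word n-gram counts.

    Hypothetical helper for illustration; the sparse vector is a dict
    that stores only the features that actually occur in the string.
    """
    tokens = text.split(delimiter)
    # Each run of n consecutive tokens is one feature, i.e. one dimension.
    features = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(features)
    # Assumed normalization: scale the count vector to unit Euclidean length.
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {feature: c / norm for feature, c in counts.items()}


if __name__ == "__main__":
    print(embed("to be or not to be"))
    print(embed("to be or not to be", n=2))
```

Storing the vector as a dictionary keeps it sparse: dimensions for features that never occur in a string are simply absent.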

6.13K lines of code

0 current contributors

over 1 year since last commit

1 user on Open Hub

Very Low Activity
0.0
 
SOFA Statistics

  Analyzed 17 days ago

SOFA is a statistics, analysis, and reporting program with an emphasis on ease of use, learning as you go, and beautiful output.

49.1K lines of code

0 current contributors

almost 2 years since last commit

1 user on Open Hub

Very Low Activity
0.0
 
Licenses: No declared licenses

jMotif

  Analyzed over 1 year ago

jMotif implements in Java a number of methods for timeseries data handling and analysis:
* Z normalization of timeseries
* Piecewise Aggregate Approximation (PAA) of timeseries
* Symbolic Aggregate Approximation (SAX) of timeseries
* iSAX (indexed SAX)

To help one leverage the symbolic representation of timeseries, it implements:
* TFIDF statistics
* Cosine similarity
* Sequitur algorithm

as well as their application for:
* Motif (recurring patterns) detection with SAX
* Discord (unique patterns) detection with SAX
* Timeseries classification
* Timeseries clustering
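
To make the first three methods in the list concrete, here is a minimal Python sketch of Z normalization, PAA and SAX chained together. It does not reproduce jMotif's Java API: the function names, the population standard deviation, the four-letter alphabet with its standard-normal breakpoints, and the requirement that the series length be divisible by the segment count are all assumptions made for illustration.

```python
import math
from bisect import bisect_right

# Approximate quartiles of the standard normal distribution; they split it
# into 4 equiprobable regions, one per letter (assumed alphabet size 4).
BREAKPOINTS_4 = [-0.6745, 0.0, 0.6745]


def znorm(series, eps=1e-8):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std < eps:
        # A (nearly) constant series carries no shape; map it to zeros.
        return [0.0] * n
    return [(x - mean) / std for x in series]


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment.

    Assumes len(series) is divisible by segments.
    """
    size = len(series) // segments
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(segments)]


def sax(series, segments, breakpoints=BREAKPOINTS_4):
    """Convert a series to a SAX word: z-normalize, reduce with PAA, then
    map each segment mean to a letter according to the breakpoints."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    reduced = paa(znorm(series), segments)
    return "".join(letters[bisect_right(breakpoints, v)] for v in reduced)


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], segments=4))  # prints "abcd"
```

The resulting words are what make text-style techniques such as TFIDF statistics and cosine similarity applicable to timeseries.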

60.5K lines of code

2 current contributors

over 1 year since last commit

1 user on Open Hub

Activity Not Available
0.0
 
Knowing Datamining

  Analyzed 7 days ago

Data mining framework based on WEKA and Akka.

8.58K lines of code

0 current contributors

over 3 years since last commit

1 user on Open Hub

Inactive
0.0
 