Tags : Browse Projects

Apache CouchDB

Claimed by Apache Software Foundation Analyzed 7 months ago

CouchDB is a distributed document database system with bi-directional replication. It makes it simple to build collaborative applications that can be replicated offline by users, with full interactivity (query, add, update, delete), and later "synced up" with everyone else's changes when back online.

274K lines of code

70 current contributors

7 months since last commit

119 users on Open Hub

Activity Not Available
4.75676
   
Apache Spark

Claimed by Apache Software Foundation Analyzed 7 months ago

Apache Spark is an open source cluster computing system that aims to make data analytics fast: both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly more rapidly than with disk-based systems like Hadoop. To make programming faster, Spark offers high-level APIs in Scala, Java and Python, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala or Python shells. Spark integrates closely with Hadoop to run inside Hadoop clusters and can access any existing Hadoop data source.

1.4M lines of code

380 current contributors

7 months since last commit

55 users on Open Hub

Activity Not Available
5.0
 
Apache Mahout

Claimed by Apache Software Foundation Analyzed 4 months ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean: scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance for non-distributed algorithms as well.

144K lines of code

2 current contributors

11 months since last commit

24 users on Open Hub

Activity Not Available
3.6
   
Apache Hive

Claimed by Apache Software Foundation Analyzed 7 months ago

Hive is a data warehouse infrastructure built on top of Hadoop that provides tools for easy data summarization, ad hoc querying and analysis of large datasets stored in Hadoop files. It provides a mechanism to put structure on this data, and it also provides a simple query language called Hive QL, which is based on SQL and enables users familiar with SQL to query this data. At the same time, this language allows traditional map/reduce programmers to plug in their custom mappers and reducers to do more sophisticated analysis that may not be supported by the built-in capabilities of the language.

1.72M lines of code

114 current contributors

7 months since last commit

24 users on Open Hub

Activity Not Available
5.0
 
riak

  Analyzed 4 months ago

Riak combines a decentralized key-value store, a flexible map/reduce engine, and a friendly HTTP/JSON query interface to provide a database ideally suited for Web applications.

198K lines of code

0 current contributors

about 2 years since last commit

18 users on Open Hub

Activity Not Available
5.0
 
hazelcast

  Analyzed about 9 hours ago

Hazelcast is a clustering and highly scalable data distribution platform for Java.

Features:
* Distributed implementations of java.util.{Queue, Set, List, Map}
* Distributed implementation of java.util.concurrent.locks.Lock
* Distributed implementation of java.util.concurrent.ExecutorService
* Distributed MultiMap for one-to-many relationships
* Distributed Topic for publish/subscribe messaging
* Transaction support and J2EE container integration via JCA
* Socket level encryption support for secure clusters
* Synchronous (write-through) and asynchronous (write-behind) persistence
* Second level cache provider for Hibernate
* Monitoring and management of the cluster via JMX
* Dynamic HTTP session clustering
* Support for cluster info and membership events
* Dynamic discovery
* Dynamic scaling
* Dynamic partitioning with backups
* Dynamic fail-over

Hazelcast is for you if you want to:
* share data/state among many servers (e.g. web session sharing)
* cache your data (distributed cache) for better performance
* cluster your application
* provide secure communication among servers
* partition your in-memory data
* send/receive messages among applications
* distribute workload onto many servers
* take advantage of parallel processing
* provide fail-safe data management

Hazelcast is pure Java. JVMs that are running Hazelcast will dynamically cluster. Although by default Hazelcast uses multicast for discovery, it can also be configured to use only TCP/IP for environments where multicast is not available or preferred. Communication among cluster members is always over TCP/IP, using Java NIO. The default configuration comes with 1 backup, so if one node fails, no data will be lost. It is as simple as using java.util.{Queue, Set, List, Map}: just add the hazelcast.jar to your classpath and start coding. A test application comes with the Hazelcast distribution that simulates the queue, set, map and lock APIs. A 12-minute screencast is also available to help you get started quickly.

930K lines of code

73 current contributors

3 days since last commit

14 users on Open Hub

Very High Activity
5.0
 
ScUtil

  Analyzed about 6 hours ago

Hundreds of functions on a variety of topics, from statistics to string parsing, module utilities to network tools. Everyone's pet library accumulates features over time. My Erlang library got big, fast. I often find myself giving functions from it out to other people, and a lot of my other libraries are dependent on ScUtil in various ways, so I figured what the hell, let's give it away. This library is believed to be efficiently implemented at all points. Efficiency tips are, however, both appreciated and taken seriously. ScUtil uses the TestErl library for unit, regression and stochastic testing. ScUtil is free and MIT licensed, because the GPL is evil. ScUtil is written by John Haugeland, from http://fullof.bs/ .

9.11K lines of code

0 current contributors

about 3 years since last commit

11 users on Open Hub

Inactive
4.8
   
Infinispan

  Analyzed about 8 hours ago

Infinispan is an open source, JVM-based data grid platform. It provides a high-performance, distributed and highly concurrent data structure, and it also supports JTA transactions, eviction, and passivation/overflow to external storage.

669K lines of code

39 current contributors

6 days since last commit

11 users on Open Hub

High Activity
5.0
 
Apache Flink

Claimed by Apache Software Foundation Analyzed about 1 hour ago

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale. Learn more about Flink at http://flink.apache.org/

1.11M lines of code

277 current contributors

10 days since last commit

9 users on Open Hub

Very High Activity
5.0
 
Lokad.Cloud - O/C mapper for Azure

  Analyzed about 2 years ago

O/C mapper (object to cloud). Leverage Windows Azure without getting dragged down by low level technicalities.

Key features:
* Queue Services as a scalable equivalent of Windows Services.
* Scheduled Services as a cloud equivalent of the task scheduler.
* Strong-typed blob I/O.
* Scalable logs and monitoring.
* Inversion of Control on the cloud.
* Web administration console for cloud services.

52.3K lines of code

0 current contributors

over 5 years since last commit

5 users on Open Hub

Activity Not Available
4.5
   