Tags : Browse Projects

Apache Spark

Claimed by Apache Software Foundation

Analyzed 15 days ago

Apache Spark is an open source cluster computing system that aims to make data analytics fast, both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly more rapidly than with disk-based systems like Hadoop. To make programming faster, Spark offers high-level APIs in Scala, Java and Python, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala or Python shells. Spark integrates closely with Hadoop to run inside Hadoop clusters and can access any existing Hadoop data source.

1.15M lines of code

441 current contributors

17 days since last commit

47 users on Open Hub

Very High Activity
5.0

collectd

  Analyzed about 2 months ago

collectd is a small daemon which collects system information periodically and writes the results to an RRD-file. What does collectd do? collectd collects information about the system it is running on and writes this information into special database files. These database files can then be used to generate graphs of the collected data. collectd itself does not generate graphs; it only collects the data. You should use software like drraw to generate pretty pictures from these RRD-files. Nonetheless, sample scripts are included to get you started on your own graphing scripts.

261K lines of code

60 current contributors

5 months since last commit

33 users on Open Hub

Activity Not Available
4.81818
Licenses: BSD-3-Clause, MIT

Kazoo Platform

  Analyzed about 1 hour ago

Kazoo is a scalable, distributed, cloud-based telephony platform that allows you to build powerful telephony applications with a rich set of APIs. Designed to handle anything from large carrier to small countries, the Whistle infrastructure can do it all. There are no lock-ins and the software is open-source to give you complete freedom.

Services include:
- Complete redundancy and failover between data centers
- Complete replication of all data
- Use of Map/Reduce algorithms inside NoSQL databases
- Multi-master replication and caching of registrations, active channels and call lookups
- Load balancing built-in
- Event driven messaging for managing and using calls
- A complete REST interface for implementing call flow features

247K lines of code

32 current contributors

3 months since last commit

14 users on Open Hub

High Activity
5.0

hazelcast

  Analyzed 1 day ago

Hazelcast is a clustering and highly scalable data distribution platform for Java.

Features:
- Distributed implementations of java.util.{Queue, Set, List, Map}
- Distributed implementation of java.util.concurrent.locks.Lock
- Distributed implementation of java.util.concurrent.ExecutorService
- Distributed MultiMap for one-to-many relationships
- Distributed Topic for publish/subscribe messaging
- Transaction support and J2EE container integration via JCA
- Socket level encryption support for secure clusters
- Synchronous (write-through) and asynchronous (write-behind) persistence
- Second level cache provider for Hibernate
- Monitoring and management of the cluster via JMX
- Dynamic HTTP session clustering
- Support for cluster info and membership events
- Dynamic discovery
- Dynamic scaling
- Dynamic partitioning with backups
- Dynamic fail-over

Hazelcast is for you if you want to:
- share data/state among many servers (e.g. web session sharing)
- cache your data (distributed cache) for better performance
- cluster your application
- provide secure communication among servers
- partition your in-memory data
- send/receive messages among applications
- distribute workload onto many servers
- take advantage of parallel processing
- provide fail-safe data management

Hazelcast is pure Java. JVMs that are running Hazelcast will dynamically cluster. Although by default Hazelcast will use multicast for discovery, it can also be configured to only use TCP/IP for environments where multicast is not available or preferred. Communication among cluster members is always TCP/IP over Java NIO. The default configuration comes with 1 backup, so if one node fails, no data will be lost. It is as simple as using java.util.{Queue, Set, List, Map}. Just add the hazelcast.jar into your classpath and start coding. A test application comes with the Hazelcast distribution that simulates the queue, set, map and lock APIs. You may want to watch the following 12-minute screencast to quickly get started.

687K lines of code

65 current contributors

3 months since last commit

14 users on Open Hub

Very High Activity
5.0

Apache Flink

Claimed by Apache Software Foundation

Analyzed 16 days ago

Mirror of Apache Flink at git://git.apache.org/incubator-flink.git

1.28M lines of code

209 current contributors

18 days since last commit

7 users on Open Hub

Very High Activity
5.0

Apache Storm

Claimed by Apache Software Foundation

Analyzed 16 days ago

Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm is simple and can be used with any programming language. Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate. Storm integrates with the queueing and database technologies you already use. A Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation however needed.

357K lines of code

92 current contributors

20 days since last commit

6 users on Open Hub

Very High Activity
5.0

Apache Apex

Claimed by Apache Software Foundation

Analyzed about 1 month ago

The Apex Project is an enterprise grade native YARN big data-in-motion platform that unifies stream and batch processing. Apex processes big data in-motion in a highly scalable, highly performant, fault tolerant, stateful, secure, distributed, and easily operable way. It provides a simple API that enables users to write or re-use generic Java code, thereby lowering the expertise needed to write big data applications.

262K lines of code

55 current contributors

about 1 month since last commit

6 users on Open Hub

Activity Not Available
5.0

Cluster SSH - Cluster Admin Via SSH

  Analyzed 2 days ago

ClusterSSH controls a number of xterm windows via a single graphical console window to allow commands to be interactively run on multiple servers over an ssh connection.

6.12K lines of code

0 current contributors

over 2 years since last commit

5 users on Open Hub

Inactive
4.0
Licenses: No declared licenses

Apache Ignite

Claimed by Apache Software Foundation

Analyzed 16 days ago

Apache Ignite In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.

1.16M lines of code

104 current contributors

17 days since last commit

4 users on Open Hub

Very High Activity
0.0

Crate Data

Claimed by Crate.IO

Analyzed 1 day ago

A massively scalable SQL data store. Zero administration required.

159K lines of code

24 current contributors

3 months since last commit

4 users on Open Hub

High Activity
5.0