Very High Activity
Analyzed about 13 hours ago, based on code collected about 13 hours ago.

Project Summary

Apache Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write.

To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly more rapidly than with disk-based systems like Hadoop.
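The load-once, query-many idea can be sketched in plain Python. This is only an illustration of the access pattern on a single machine and does not use Spark's API; the `load_records` helper and the sample data are hypothetical.

```python
import os
import tempfile


def load_records(path):
    """Read every line from disk and parse it into an int."""
    with open(path) as f:
        return [int(line) for line in f]


# Write a small sample dataset to a temporary file (illustrative data only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("\n".join(str(n) for n in range(1, 11)))
    data_path = tmp.name

# Disk-based style: every query re-reads and re-parses the file.
total_from_disk = sum(load_records(data_path))
evens_from_disk = [n for n in load_records(data_path) if n % 2 == 0]

# In-memory style: load once, then run any number of queries on the cached list.
cached = load_records(data_path)
total_from_memory = sum(cached)
evens_from_memory = [n for n in cached if n % 2 == 0]

# Clean up the temporary file.
os.remove(data_path)
```

Both styles return the same answers; the in-memory version pays the read and parse cost once, however many queries follow.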

To make programming faster, Spark offers high-level APIs in Scala, Java and Python, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala or Python shells.
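As a rough sketch of the collection-style programming described above, the following plain Python snippet counts words in a local list using flat-map, filter, map, and reduce steps. Spark exposes similarly named operations (such as `map`, `filter`, and `reduce`) on distributed datasets, but this example runs on an ordinary list and makes no Spark calls; the sample lines are made up for illustration.

```python
from functools import reduce

# Illustrative input; in a cluster setting this would be a distributed dataset.
lines = ["spark makes analytics fast", "fast to run", "fast to write"]

# Split every line into words (a "flat map" over the collection).
words = [word for line in lines for word in line.split()]

# Keep only the occurrences of one word of interest.
fast_words = list(filter(lambda w: w == "fast", words))

# Map each occurrence to 1, then reduce by summing to get a count.
fast_count = reduce(lambda a, b: a + b, map(lambda _: 1, fast_words), 0)

# Find the longest word with an ordinary collection operation.
longest_word = max(words, key=len)
```

The same chain of small transformations is the style the high-level APIs aim for, with the data spread across a cluster instead of held in one list.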

Spark integrates closely with Hadoop to run inside Hadoop clusters and can access any existing Hadoop data source.

Tags

apache bigdata cluster clustercomputing distributed distributed_computing ec2 graph_computing hadoop hdfs in_memory java machine_learning mapreduce ml python scala sql streaming streamingdata

Languages

  • Scala: 68%
  • Java: 16%
  • Python: 8%
  • 11 Other: 8%

30 Day Summary

Jul 17 2017 — Aug 16 2017

12 Month Summary

Aug 16 2016 — Aug 16 2017
  • 5,942 Commits
    Down 3,514 (37%) from the previous 12 months
  • 439 Contributors
    Down 120 (21%) from the previous 12 months