Apache Spark datasource for OrientDB
created at Oct. 31, 2016, 2:51 p.m.
The official Riak Spark Connector for Apache Spark with Riak TS and Riak KV
created at May 7, 2015, 7:22 p.m.
PMML evaluator library for the Apache Spark cluster computing system (http://spark.apache.org/)
created at Nov. 29, 2015, 10:03 a.m.
Apache (Py)Spark type annotations (stub files).
created at Jan. 31, 2017, 1:13 a.m.
C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
created at March 26, 2018, 7:58 p.m.
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
created at July 6, 2017, 10:13 a.m.
An implementation of DBSCAN running on top of Apache Spark
created at March 15, 2015, 12:45 a.m.
Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs
created at March 3, 2016, 4:01 p.m.
A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.
created at Oct. 26, 2020, 1:56 p.m.
Spark Gotchas: a subjective compilation of Apache Spark tips and tricks
created at June 2, 2016, 10:21 p.m.
Mazerunner extends a Neo4j graph database to run scheduled big data graph compute algorithms at scale with HDFS and Apache Spark.
created at Oct. 28, 2014, 9:33 p.m.
Apache Spark testing helpers (dependency-free; works with ScalaTest, uTest, and MUnit)
created at April 6, 2017, 9:40 p.m.
Stanford CoreNLP wrapper for Apache Spark
created at Aug. 21, 2015, 8:54 p.m.