Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments.
created at April 17, 2015, 7:39 p.m.
Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs
created at March 3, 2016, 4:01 p.m.
Oryx 2: Lambda architecture on Apache Spark and Apache Kafka for real-time, large-scale machine learning
created at July 25, 2014, 8:08 p.m.
Spark Gotchas. A subjective compilation of Apache Spark tips and tricks
created at June 2, 2016, 10:21 p.m.
Apache Spark testing helpers (dependency-free; works with ScalaTest, uTest, and MUnit)
created at April 6, 2017, 9:40 p.m.
Base classes to use when writing tests with Spark
created at Jan. 30, 2015, 10:23 p.m.
Stanford CoreNLP wrapper for Apache Spark
created at Aug. 21, 2015, 8:54 p.m.
Apache (Py)Spark type annotations (stub files).
created at Jan. 31, 2017, 1:13 a.m.
REST job server for Apache Spark
created at Aug. 21, 2014, 11:07 p.m.
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
created at Nov. 19, 2013, 11:47 p.m.