The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
created at July 6, 2017, 10:13 a.m.
A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.
created at Oct. 26, 2020, 1:56 p.m.
Apache Spark testing helpers (dependency-free; works with Scalatest, uTest, and MUnit)
created at April 6, 2017, 9:40 p.m.
Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs
created at March 3, 2016, 4:01 p.m.
Essential Spark extensions and helper methods ✨😲
created at Feb. 16, 2017, 3:41 p.m.
XML data source for Spark SQL and DataFrames
created at Nov. 26, 2015, 2:46 a.m.
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
created at Nov. 19, 2013, 11:47 p.m.
Base classes to use when writing tests with Spark
created at Jan. 30, 2015, 10:23 p.m.
Sparkling Water provides H2O functionality inside a Spark cluster
created at Oct. 13, 2014, 11:06 p.m.
Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments
created at April 17, 2015, 7:39 p.m.
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
created at June 25, 2017, 7 a.m.
State of the Art Natural Language Processing
created at Sept. 24, 2017, 7:36 p.m.