ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
updated at May 13, 2024, 11:56 a.m.
Neo4j Connector for Apache Spark, providing bidirectional read/write access to Neo4j from Spark through the Spark DataSource APIs
updated at May 13, 2024, 8:43 a.m.
Essential Spark extensions and helper methods ✨😲
updated at May 12, 2024, 6:41 p.m.
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
updated at May 10, 2024, 5:12 a.m.
REST job server for Apache Spark
updated at May 9, 2024, 3:16 a.m.
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
updated at May 1, 2024, 4:39 p.m.
Interactive and Reactive Data Science using Scala and Spark.
updated at May 1, 2024, 3:08 p.m.
Spark Gotchas. A subjective compilation of Apache Spark tips and tricks
updated at April 30, 2024, 6:38 p.m.
A library for time series analysis on Apache Spark
updated at April 24, 2024, 9:39 a.m.
(Deprecated) Scikit-learn integration package for Apache Spark
updated at April 17, 2024, 4:13 a.m.
PMML evaluator library for the Apache Spark cluster computing system (http://spark.apache.org/)
updated at March 31, 2024, 2:17 p.m.