Scientific workflow engine designed for simplicity and scalability. Trivially transition between one-off use cases and massive-scale production environments.
created at April 17, 2015, 7:39 p.m.
Oryx 2: Lambda architecture on Apache Spark and Apache Kafka for real-time, large-scale machine learning
created at July 25, 2014, 8:08 p.m.
Base classes to use when writing tests with Spark
created at Jan. 30, 2015, 10:23 p.m.
REST job server for Apache Spark
created at Aug. 21, 2014, 11:07 p.m.
Sparkling Water provides H2O functionality inside a Spark cluster
created at Oct. 13, 2014, 11:06 p.m.
scikit-learn: machine learning in Python
created at Aug. 17, 2010, 9:43 a.m.
PMML evaluator library for the Apache Spark cluster computing system (http://spark.apache.org/)
created at Nov. 29, 2015, 10:03 a.m.
DataStax Connector for Apache Spark to Apache Cassandra
created at June 27, 2014, 3:45 p.m.
XML data source for Spark SQL and DataFrames
created at Nov. 26, 2015, 2:46 a.m.
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
created at Nov. 19, 2013, 11:47 p.m.