Apache Spark datasource for OrientDB
updated at Aug. 3, 2022, 7:26 a.m.
Apache (Py)Spark type annotations (stub files).
updated at Sept. 16, 2023, 6:30 p.m.
The official Riak Spark Connector for Apache Spark with Riak TS and Riak KV
updated at Sept. 27, 2023, 10:28 a.m.
Stanford CoreNLP wrapper for Apache Spark
updated at Jan. 21, 2024, 2:22 p.m.
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
updated at Feb. 8, 2024, 11:01 a.m.
Spark Gotchas. A subjective compilation of Apache Spark tips and tricks
updated at Feb. 20, 2024, 9:34 a.m.
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
updated at Feb. 21, 2024, 3:07 p.m.
C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
updated at Feb. 29, 2024, 4:50 a.m.
An implementation of DBSCAN running on top of Apache Spark
updated at March 17, 2024, 12:31 a.m.
A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.
updated at March 28, 2024, 5:47 a.m.
Mazerunner extends a Neo4j graph database to run scheduled big data graph compute algorithms at scale with HDFS and Apache Spark.
updated at March 31, 2024, 2:15 p.m.