awesome-spark/awesome-spark

sparkle by tweag

Haskell on Apache Spark.

created at Nov. 9, 2015, 3:49 p.m.

Haskell

61 +0

447 +0

30 +0

GitHub

spark-fast-tests by mrpowers-io

Apache Spark testing helpers (dependency-free; works with Scalatest, uTest, and MUnit)

created at April 6, 2017, 9:40 p.m.

Scala

16 +0

436 +0

77 +0

GitHub

delight by datamechanics

A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.

created at Oct. 26, 2020, 1:56 p.m.

Scala

16 +0

342 +0

53 +0

GitHub

neo4j-spark-connector by neo4j

Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs

created at March 3, 2016, 4:01 p.m.

Scala

34 +0

313 +0

112 +0

GitHub

first-edition by spark-in-action

Code repository for the first edition of the book Spark in Action.

created at March 25, 2015, 2:54 a.m.

Scala

42 +0

273 +0

188 +0

GitHub

joblib-spark by joblib

Joblib Apache Spark Backend

created at Nov. 20, 2019, 7:02 p.m.

Python

9 +0

242 +0

26 +0

GitHub

crossdata by Stratio

DISCONTINUED - Easy access to big things. Library for Apache Spark extending and improving its capabilities

created at Feb. 6, 2014, 9:41 a.m.

Scala

101 +0

169 +0

51 +0

GitHub

spark-connect-go by apache

Apache Spark Connect Client for Golang

created at May 30, 2023, 10:09 a.m.

Go

25 +0

161 +2

32 +0

GitHub

aut by archivesunleashed

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

created at July 6, 2017, 10:13 a.m.

Scala

15 +0

137 +0

33 +0

GitHub

jpmml-evaluator-spark by jpmml

PMML evaluator library for the Apache Spark cluster computing system (http://spark.apache.org/)

created at Nov. 29, 2015, 10:03 a.m.

Java

14 +0

94 +0

43 +0

GitHub

spark-connect-rs by sjrusso8

Apache Spark Connect Client for Rust

created at Sept. 18, 2023, 1:32 p.m.

Rust

5 +0

90 +0

15 +0

GitHub

sparkly by Tubular

Helpers & syntactic sugar for PySpark.

created at Oct. 7, 2016, 3:50 p.m.

Python

41 +0

60 +0

9 +0

GitHub

itachi by yaooqinn

A library that brings useful functions from various modern database management systems to Apache Spark

created at April 2, 2020, noon

Scala

5 +0

56 +0

4 +0

GitHub

spark-connect-csharp by mdrakiburrahman

Apache Spark Connect Client for C#

created at April 14, 2024, 11:40 p.m.

C#

2 +0

1 +0

0 +0

GitHub