spark-connect-csharp by mdrakiburrahman

Apache Spark Connect Client for C#

created at April 14, 2024, 11:40 p.m.

C#

2 +0

1 +0

0 +0

chispa by MrPowers

PySpark test helper methods with beautiful error messages

created at March 19, 2019, 3:52 p.m.

Python

5 +0

620 +3

68 +0

spark-connect-rs by sjrusso8

Apache Spark Connect Client for Rust

created at Sept. 18, 2023, 1:32 p.m.

Rust

5 +0

90 +0

15 +0

itachi by yaooqinn

A library that brings useful functions from various modern database management systems to Apache Spark

created at April 2, 2020, noon

Scala

5 +0

56 +0

4 +0

joblib-spark by joblib

Joblib Apache Spark Backend

created at Nov. 20, 2019, 7:02 p.m.

Python

9 +0

242 +0

26 +0

jpmml-evaluator-spark by jpmml

PMML evaluator library for the Apache Spark cluster computing system (http://spark.apache.org/)

created at Nov. 29, 2015, 10:03 a.m.

Java

14 +0

94 +0

43 +0

aut by archivesunleashed

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

created at July 6, 2017, 10:13 a.m.

Scala

15 +0

137 +0

33 +0

spark-fast-tests by mrpowers-io

Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)

created at April 6, 2017, 9:40 p.m.

Scala

16 +0

436 +0

77 +0

delight by datamechanics

A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.

created at Oct. 26, 2020, 1:56 p.m.

Scala

16 +0

342 +0

53 +0

python-deequ by awslabs

Python API for Deequ

created at Nov. 9, 2020, 9:28 p.m.

Jupyter Notebook

17 +0

730 +3

136 +1

kotlin-spark-api by Kotlin

This project gives Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x.

created at June 1, 2020, 11:07 a.m.

Kotlin

20 -1

461 +0

35 +0

quinn by mrpowers-io

PySpark methods to enhance developer productivity 📣 👯 🎉

created at Sept. 15, 2017, 1:02 p.m.

Python

20 +0

642 +1

99 +0

spark-connect-go by apache

Apache Spark Connect Client for Golang

created at May 30, 2023, 10:09 a.m.

Go

25 +0

161 +2

32 +0

flintrock by nchammas

A command-line tool for launching Apache Spark clusters.

created at June 4, 2015, 7:14 a.m.

Python

31 +0

638 +0

116 +0

spark-daria by mrpowers-io

Essential Spark extensions and helper methods ✨😲

created at Feb. 16, 2017, 3:41 p.m.

Scala

34 +0

754 +0

152 +0

neo4j-spark-connector by neo4j

Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs

created at March 3, 2016, 4:01 p.m.

Scala

34 +0

313 +0

112 +0

spark-xml by databricks

XML data source for Spark SQL and DataFrames

created at Nov. 26, 2015, 2:46 a.m.

Scala

39 +0

505 +0

226 -1

sparkly by Tubular

Helpers & syntactic sugar for PySpark.

created at Oct. 7, 2016, 3:50 p.m.

Python

41 +0

60 +0

9 +0

first-edition by spark-in-action

The book's repo

created at March 25, 2015, 2:54 a.m.

Scala

42 +0

273 +0

188 +0

sparkmagic by jupyter-incubator

Jupyter magics and kernels for working with remote Spark clusters

created at Sept. 21, 2015, 3:35 p.m.

Python

48 +0

1,328 -1

447 +1
