kotlin-spark-api by Kotlin

This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x.

created at June 1, 2020, 11:07 a.m.

Kotlin

19 +0

443 +0

34 +0

GitHub
sparkle by tweag

Haskell on Apache Spark.

created at Nov. 9, 2015, 3:49 p.m.

Haskell

59 +0

444 +0

30 +0

GitHub
spark-xml by databricks

XML data source for Spark SQL and DataFrames

created at Nov. 26, 2015, 2:46 a.m.

Scala

40 +0

487 -1

224 +0

GitHub
magellan by harsha2010

Geo Spatial Data Analytics on Spark

created at June 1, 2015, 1:06 a.m.

Scala

65 +0

534 +0

150 +0

GitHub
spark-avro by databricks

Avro Data Source for Apache Spark

created at Sept. 30, 2014, 5:50 p.m.

Scala

70 -1

539 -1

310 +0

GitHub
quinn by MrPowers

pyspark methods to enhance developer productivity 📣 👯 🎉

created at Sept. 15, 2017, 1:02 p.m.

Python

19 +0

583 +1

92 +0

GitHub
flambo by sorenmacbeth

A Clojure DSL for Apache Spark

created at Jan. 7, 2014, 7:42 p.m.

Clojure

78 +0

609 +0

86 +0

GitHub
dist-keras by cerndb

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

created at July 25, 2016, 9:47 a.m.

Python

49 +0

623 +0

170 +0

GitHub
flintrock by nchammas

A command-line tool for launching Apache Spark clusters.

created at June 4, 2015, 7:14 a.m.

Python

33 +0

632 +0

115 +0

GitHub
mongo-spark by mongodb

The MongoDB Spark Connector

created at May 20, 2015, 5:59 p.m.

Java

79 +0

703 +0

307 +0

GitHub
incubator-toree by apache

Mirror of Apache Toree (Incubating)

created at Jan. 7, 2016, 8 a.m.

Scala

48 +0

733 +1

224 +0

GitHub
spark-daria by MrPowers

Essential Spark extensions and helper methods ✨😲

created at Feb. 16, 2017, 3:41 p.m.

Scala

33 +0

742 +0

148 +0

GitHub
docker-spark by sequenceiq

No description provided.

created at July 11, 2014, 3:45 p.m.

Shell

65 +0

764 +0

283 +0

GitHub
photon-ml by linkedin

A scalable machine learning library on Apache Spark

created at Feb. 3, 2016, 1:12 a.m.

Terra

83 +0

789 +0

185 +0

GitHub
incubator-livy by apache

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.

created at June 25, 2017, 7 a.m.

Scala

57 +0

858 +1

594 +0

GitHub
sparklyr by sparklyr

R interface for Apache Spark

created at May 20, 2016, 3:28 p.m.

R

73 +0

929 +3

302 +0

GitHub
Mobius by Microsoft

C# and F# language binding and extensions to Apache Spark

created at Oct. 27, 2015, 7:21 p.m.

C#

145 +0

940 +1

212 +0

GitHub
hail by hail-is

Cloud-native genomic dataframes and batch computing

created at Oct. 27, 2015, 8:55 p.m.

Python

55 +0

943 +2

238 +2

GitHub
sparkling-water by h2oai

Sparkling Water provides H2O functionality inside a Spark cluster

created at Oct. 13, 2014, 11:06 p.m.

Scala

179 +0

952 +0

363 +0

GitHub
cromwell by broadinstitute

Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments.

created at April 17, 2015, 7:39 p.m.

Scala

112 +0

962 +1

351 +1

GitHub