first-edition by spark-in-action

The book's repo

created at March 25, 2015, 2:54 a.m.

Scala

42 +0

272 +0

191 +0

GitHub
incubator-toree by apache

Mirror of Apache Toree (Incubating)

created at Jan. 7, 2016, 8 a.m.

Scala

48 -1

731 +0

224 +0

GitHub
sparkmagic by jupyter-incubator

Jupyter magics and kernels for working with remote Spark clusters

created at Sept. 21, 2015, 3:35 p.m.

Python

49 +0

1,287 +0

438 +0

GitHub
dist-keras by cerndb

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

created at July 25, 2016, 9:47 a.m.

Python

49 +0

623 -1

170 +0

GitHub
spark-corenlp by databricks

Stanford CoreNLP wrapper for Apache Spark

created at Aug. 21, 2015, 8:54 p.m.

Scala

52 +0

423 +0

120 +0

GitHub
hail by hail-is

Cloud-native genomic dataframes and batch computing

created at Oct. 27, 2015, 8:55 p.m.

Python

55 +0

938 +0

235 +0

GitHub
neo4j-mazerunner by neo4j-contrib

Mazerunner extends a Neo4j graph database to run scheduled big data graph compute algorithms at scale with HDFS and Apache Spark.

created at Oct. 28, 2014, 9:33 p.m.

Java

56 +0

377 +0

105 +0

GitHub
incubator-livy by apache

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.

created at June 25, 2017, 7 a.m.

Scala

57 +0

857 +1

594 +0

GitHub
graphframes by graphframes

No description provided.

created at Jan. 20, 2016, 11:17 p.m.

Scala

58 +0

972 +1

232 +0

GitHub
sparkle by tweag

Haskell on Apache Spark.

created at Nov. 9, 2015, 3:49 p.m.

Haskell

59 +0

444 +0

30 +0

GitHub
joblib by joblib

Computing with Python functions.

created at May 7, 2010, 6:48 a.m.

Python

61 +0

3,679 +9

405 +3

GitHub
kyuubi by apache

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

created at Dec. 18, 2017, 9:05 a.m.

Scala

62 -1

1,947 +6

860 +1

GitHub
magellan by harsha2010

Geo Spatial Data Analytics on Spark

created at June 1, 2015, 1:06 a.m.

Scala

65 +0

534 +1

150 +0

GitHub
docker-spark by sequenceiq

No description provided.

created at July 11, 2014, 3:45 p.m.

Shell

65 +0

764 +0

284 +0

GitHub
spark-riak-connector by basho

The official Riak Spark Connector for Apache Spark with Riak TS and Riak KV

created at May 7, 2015, 7:22 p.m.

Scala

66 +0

60 +0

29 +0

GitHub
mleap by combust

MLeap: Deploy ML Pipelines to Production

created at Aug. 23, 2016, 3:51 a.m.

Scala

69 +0

1,496 +2

313 +0

GitHub
spark-avro by databricks

Avro Data Source for Apache Spark

created at Sept. 30, 2014, 5:50 p.m.

Scala

71 +0

540 +0

310 +0

GitHub
sparklyr by sparklyr

R interface for Apache Spark

created at May 20, 2016, 3:28 p.m.

R

73 +0

926 +2

302 +0

GitHub
flint by twosigma

A Time Series Library for Apache Spark

created at Oct. 19, 2016, 5:44 p.m.

Scala

77 +0

992 +0

184 +0

GitHub
spark-testing-base by holdenk

Base classes to use when writing tests with Spark

created at Jan. 30, 2015, 10:23 p.m.

Scala

78 +0

1,497 +4

358 +0

GitHub