sparkmagic by jupyter-incubator

Jupyter magics and kernels for working with remote Spark clusters

created at Sept. 21, 2015, 3:35 p.m.

Python

49 +0

1,286 +4

437 +0

spark-timeseries by sryza

A library for time series analysis on Apache Spark

created at March 11, 2015, 8:14 a.m.

Scala

134 +0

1,189 +1

427 +0

spark-sklearn by databricks

(Deprecated) Scikit-learn integration package for Apache Spark

created at Sept. 2, 2015, 6:44 p.m.

Python

94 +0

1,077 +0

232 +0

spark-csv by databricks

CSV Data Source for Apache Spark 1.x

created at Dec. 3, 2014, 12:56 a.m.

Scala

418 +2

1,048 +0

446 +0

livy by cloudera

Livy is an open source REST interface for interacting with Apache Spark from anywhere

created at Nov. 17, 2015, 6:55 a.m.

Scala

91 +0

1,004 +0

316 -5

flint by twosigma

A Time Series Library for Apache Spark

created at Oct. 19, 2016, 5:44 p.m.

Scala

77 +0

992 +0

184 +0

graphframes by graphframes

(No description provided)

created at Jan. 20, 2016, 11:17 p.m.

Scala

58 +0

971 +1

232 +0

adam by bigdatagenomics

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

created at Nov. 19, 2013, 11:47 p.m.

Scala

100 +0

967 +0

304 +0

cromwell by broadinstitute

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one-off use cases and massive-scale production environments.

created at April 17, 2015, 7:39 p.m.

Scala

112 -1

958 +1

349 +1

sparkling-water by h2oai

Sparkling Water provides H2O functionality inside a Spark cluster

created at Oct. 13, 2014, 11:06 p.m.

Scala

178 +0

952 -1

363 +1

Mobius by Microsoft

C# and F# language binding and extensions to Apache Spark

created at Oct. 27, 2015, 7:21 p.m.

C#

145 +0

937 +0

212 +0

hail by hail-is

Cloud-native genomic dataframes and batch computing

created at Oct. 27, 2015, 8:55 p.m.

Python

55 +0

934 -2

235 +1

sparklyr by sparklyr

R interface for Apache Spark

created at May 20, 2016, 3:28 p.m.

R

73 +0

923 +1

302 +0

incubator-livy by apache

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.

created at June 25, 2017, 7 a.m.

Scala

57 +0

855 +4

594 +0

photon-ml by linkedin

A scalable machine learning library on Apache Spark

created at Feb. 3, 2016, 1:12 a.m.

Terra

83 +0

790 +0

185 +0

docker-spark by sequenceiq

(No description provided)

created at July 11, 2014, 3:45 p.m.

Shell

65 +0

764 +0

284 +0

spark-daria by MrPowers

Essential Spark extensions and helper methods ✨😲

created at Feb. 16, 2017, 3:41 p.m.

Scala

33 +0

742 +0

147 +0

incubator-toree by apache

Mirror of Apache Toree (Incubating)

created at Jan. 7, 2016, 8 a.m.

Scala

49 +0

732 +0

223 +0

mongo-spark by mongodb

The MongoDB Spark Connector

created at May 20, 2015, 5:59 p.m.

Java

79 +0

702 +0

307 +0

flintrock by nchammas

A command-line tool for launching Apache Spark clusters.

created at June 4, 2015, 7:14 a.m.

Python

33 +0

630 +0

114 +0
