spark by dotnet

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

updated at April 27, 2024, 2:45 p.m.

C#

91 +0

1,999 +0

308 +0

GitHub
incubator-toree by apache

Mirror of Apache Toree (Incubating)

updated at April 28, 2024, 11:16 p.m.

Scala

48 -1

731 +0

224 +0

GitHub
itachi by yaooqinn

A library that brings useful functions from various modern database management systems to Apache Spark

updated at April 29, 2024, 3:42 p.m.

Scala

5 +0

54 +0

4 +0

GitHub
spark-gotchas by awesome-spark

Spark Gotchas. A subjective compilation of Apache Spark tips and tricks

updated at April 30, 2024, 6:38 p.m.

Unknown languages

33 +0

355 +0

82 +0

GitHub
delight by datamechanics

A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.

updated at April 30, 2024, 9:48 p.m.

Scala

16 +0

335 +0

51 +1

GitHub
spark-notebook by spark-notebook

Interactive and Reactive Data Science using Scala and Spark.

updated at May 1, 2024, 3:08 p.m.

JavaScript

190 +0

3,148 +0

654 +0

GitHub
aut by archivesunleashed

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

updated at May 1, 2024, 4:39 p.m.

Scala

15 +0

133 +0

33 +0

GitHub
first-edition by spark-in-action

The book's repo

updated at May 2, 2024, 11:57 a.m.

Scala

42 +0

272 +0

191 +0

GitHub
aas by sryza

Code to accompany Advanced Analytics with Spark from O'Reilly Media

updated at May 2, 2024, 4:43 p.m.

Scala

148 +0

1,514 +0

1,032 +0

GitHub
sparkmagic by jupyter-incubator

Jupyter magics and kernels for working with remote Spark clusters

updated at May 3, 2024, 11:02 p.m.

Python

49 +0

1,287 +0

438 +0

GitHub
flintrock by nchammas

A command-line tool for launching Apache Spark clusters.

updated at May 4, 2024, 11:07 a.m.

Python

33 +0

631 +0

114 +0

GitHub
livy by cloudera

Livy is an open-source REST interface for interacting with Apache Spark from anywhere

updated at May 4, 2024, 5:57 p.m.

Scala

91 +0

1,005 +0

316 +0

GitHub
blaze by blaze

NumPy and Pandas interface to Big Data

updated at May 5, 2024, 3:19 a.m.

Python

195 +0

3,179 -1

393 +0

GitHub
kotlin-spark-api by Kotlin

This project gives Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x

updated at May 5, 2024, 10:58 a.m.

Kotlin

18 +0

441 +0

34 +0

GitHub
oryx by OryxProject

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large-scale machine learning

updated at May 6, 2024, 10:14 a.m.

Java

209 +0

1,789 +1

405 +0

GitHub
spark-csv by databricks

CSV Data Source for Apache Spark 1.x

updated at May 7, 2024, 12:54 p.m.

Scala

418 +0

1,049 +1

446 +0

GitHub
quinn by MrPowers

pyspark methods to enhance developer productivity 📣 👯 🎉

updated at May 7, 2024, 3:46 p.m.

Python

19 +0

581 +1

91 +0

GitHub
joblib-spark by joblib

Joblib Apache Spark Backend

updated at May 8, 2024, 11:19 a.m.

Python

9 +0

238 +1

26 +0

GitHub
magellan by harsha2010

Geospatial Data Analytics on Spark

updated at May 8, 2024, 1:18 p.m.

Scala

65 +0

534 +1

150 +0

GitHub
sparkling-water by h2oai

Sparkling Water provides H2O functionality inside a Spark cluster

updated at May 8, 2024, 4:42 p.m.

Scala

179 +1

951 -1

363 +0

GitHub