Spark-The-Definitive-Guide

Spark: The Definitive Guide's Code Repository

created at May 15, 2017, 5 p.m.

Scala

186

2,745

2,716

spark-xml

XML data source for Spark SQL and DataFrames

created at Nov. 26, 2015, 2:46 a.m.

Scala

40

487

222

spark-redshift

Redshift data source for Apache Spark

created at Nov. 13, 2014, 12:08 a.m.

Scala

169

598

348

databricks-sdk-py

Databricks SDK for Python (Beta)

created at June 21, 2022, 1:49 p.m.

Python

7

150

23

databricks-ml-examples

No description provided.

created at May 30, 2017, 10:57 p.m.

Python

31

154

66

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

created at March 26, 2024, 8:21 p.m.

Python

28

1,808

163

spark-sql-perf

No description provided.

created at April 15, 2015, 9:45 p.m.

Scala

362

564

399

mlflow

Open source platform for the complete machine learning lifecycle

created at June 5, 2018, 4:05 p.m.

Python

118

1,574

212

spark-avro

Avro Data Source for Apache Spark

created at Sept. 30, 2014, 5:50 p.m.

Scala

73

536

318

click

The "Command Line Interactive Controller for Kubernetes"

created at July 7, 2017, 11:50 p.m.

Rust

79

1,008

47

spark-perf

Performance tests for Spark

created at June 17, 2014, 10:25 p.m.

Scala

49

280

178

pig-on-spark

proof-of-concept implementation of Pig-on-Spark integrated at the logical node level

created at Aug. 19, 2014, 7:07 p.m.

Scala

29

27

14

spark-corenlp

Stanford CoreNLP wrapper for Apache Spark

created at Aug. 21, 2015, 8:54 p.m.

Scala

58

371

112

tensorframes

Tensorflow wrapper for DataFrames on Apache Spark

created at March 4, 2016, 7:25 p.m.

Scala

77

722

152

sjsonnet

No description provided.

created at June 15, 2018, 3:57 a.m.

Scala

250

252

48

spark-csv

CSV Data Source for Apache Spark 1.x

created at Dec. 3, 2014, 12:56 a.m.

Scala

416

1,048

446

koalas

Koalas: Pandas API on Apache Spark

created at Jan. 3, 2019, 9:46 p.m.

Python

55

590

68

reference-apps

Spark reference applications

created at Aug. 19, 2014, 1:09 a.m.

Scala

425

655

341

LearningSparkV2

This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

created at Feb. 10, 2019, 5:17 a.m.

Scala

41

1,093

686

tableau-connector

No description provided.

created at Aug. 30, 2019, 12:31 a.m.

Scala

249

12

15

sbt-databricks

An sbt plugin for deploying code to Databricks Cloud

created at April 15, 2015, 8:12 p.m.

Scala

346

71

26

spark-integration-tests

Integration tests for Spark

created at Sept. 23, 2014, 11:58 p.m.

Scala

348

68

23

devbox

No description provided.

created at Nov. 30, 2018, 1:31 a.m.

Scala

283

37

13
