sedona by apache

A cluster computing framework for processing large-scale geospatial data

created at April 24, 2015, 6:01 p.m.

Java

95 +0

1,956 +2

693 -2

GitHub
kotlin-spark-api by Kotlin

This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x.

created at June 1, 2020, 11:07 a.m.

Kotlin

20 -1

461 +0

35 +0

GitHub
ipex-llm by intel-analytics

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.

created at Aug. 29, 2016, 7:59 a.m.

Python

251 +0

6,718 +29

1,264 +3

GitHub
spark-fast-tests by mrpowers-io

Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)

created at April 6, 2017, 9:40 p.m.

Scala

16 +0

436 +0

77 +0

GitHub
quinn by mrpowers-io

PySpark methods to enhance developer productivity 📣 👯 🎉

created at Sept. 15, 2017, 1:02 p.m.

Python

20 +0

642 +1

99 +0

GitHub
spark-daria by mrpowers-io

Essential Spark extensions and helper methods ✨😲

created at Feb. 16, 2017, 3:41 p.m.

Scala

34 +0

754 +0

152 +0

GitHub
neo4j-spark-connector by neo4j

Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs

created at March 3, 2016, 4:01 p.m.

Scala

34 +0

313 +0

112 +0

GitHub
spark-connect-go by apache

Apache Spark Connect Client for Golang

created at May 30, 2023, 10:09 a.m.

Go

25 +0

161 +2

32 +0

GitHub
spark-connect-csharp by mdrakiburrahman

Apache Spark Connect Client for C#

created at April 14, 2024, 11:40 p.m.

C#

2 +0

1 +0

0 +0

GitHub
spark-connect-rs by sjrusso8

Apache Spark Connect Client for Rust

created at Sept. 18, 2023, 1:32 p.m.

Rust

5 +0

90 +0

15 +0

GitHub
iceberg by apache

Apache Iceberg

created at Nov. 19, 2018, 4:26 p.m.

Java

160 +0

6,464 +20

2,235 +10

GitHub
hudi by apache

Upserts, Deletes And Incremental Processing on Big Data.

created at Dec. 14, 2016, 3:53 p.m.

Java

1,164 +1

5,436 +21

2,424 -1

GitHub
chispa by MrPowers

PySpark test helper methods with beautiful error messages

created at March 19, 2019, 3:52 p.m.

Python

5 +0

620 +3

68 +0

GitHub
python-deequ by awslabs

Python API for Deequ

created at Nov. 9, 2020, 9:28 p.m.

Jupyter Notebook

17 +0

730 +3

136 +1

GitHub