pyspark-stubs by zero323

Apache (Py)Spark type annotations (stub files).

updated at Sept. 16, 2023, 6:30 p.m.

Python

6 +0

114 +0

37 +0

sparkly by Tubular

Helpers & syntactic sugar for PySpark.

updated at Dec. 22, 2023, 2:37 a.m.

Python

38 +0

60 +0

7 +0

spark-sklearn by databricks

(Deprecated) Scikit-learn integration package for Apache Spark

updated at April 17, 2024, 4:13 a.m.

Python

94 +0

1,077 +0

231 +0

sparkmagic by jupyter-incubator

Jupyter magics and kernels for working with remote Spark clusters

updated at May 3, 2024, 11:02 p.m.

Python

49 +0

1,287 +0

438 +0

flintrock by nchammas

A command-line tool for launching Apache Spark clusters.

updated at May 4, 2024, 11:07 a.m.

Python

33 +0

631 +0

114 +0

blaze by blaze

NumPy and Pandas interface to Big Data

updated at May 5, 2024, 3:19 a.m.

Python

195 +0

3,179 -1

393 +0

quinn by MrPowers

pyspark methods to enhance developer productivity 📣 👯 🎉

updated at May 7, 2024, 3:46 p.m.

Python

19 +0

581 +1

91 +0

joblib-spark by joblib

Joblib Apache Spark Backend

updated at May 8, 2024, 11:19 a.m.

Python

9 +0

238 +1

26 +0

dist-keras by cerndb

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

updated at May 10, 2024, 5:12 a.m.

Python

49 +0

623 -1

170 +0

koalas by databricks

Koalas: pandas API on Apache Spark

updated at May 11, 2024, 3:34 a.m.

Python

316 +0

3,321 +0

355 +0

joblib by joblib

Computing with Python functions.

updated at May 11, 2024, 9 a.m.

Python

61 +0

3,679 +9

405 +3

hail by hail-is

Cloud-native genomic dataframes and batch computing

updated at May 11, 2024, 1:20 p.m.

Python

55 +0

938 +0

235 +0

scikit-learn by scikit-learn

scikit-learn: machine learning in Python

updated at May 12, 2024, 2:16 a.m.

Python

2,141 +0

58,265 +63

25,004 +18

ipex-llm by intel-analytics

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, etc.

updated at May 12, 2024, 3:48 a.m.

Python

242 +0

6,049 +48

1,204 +2
