incubator-livy by apache

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.

updated at May 25, 2024, 7:50 p.m.

Scala

57 +0

858 +1

594 +0

scikit-learn by scikit-learn

scikit-learn: machine learning in Python

updated at May 25, 2024, 8:59 p.m.

Python

2,141 +0

58,415 +64

25,028 +10

koalas by databricks

Koalas: pandas API on Apache Spark

updated at May 26, 2024, 2:57 a.m.

Python

318 +2

3,320 -1

355 +0

SynapseML by Microsoft

Simple and Distributed Machine Learning

updated at May 26, 2024, 6:17 a.m.

Scala

146 +0

4,991 +7

819 +1

spark-nlp by JohnSnowLabs

State of the Art Natural Language Processing

updated at May 26, 2024, 8:02 a.m.

Scala

100 +0

3,720 +4

704 +2

itachi by yaooqinn

A library that brings useful functions from various modern database management systems to Apache Spark

updated at May 26, 2024, 8:52 a.m.

Scala

5 +0

53 -1

4 +0

kyuubi by apache

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

updated at May 26, 2024, 8:54 a.m.

Scala

64 +2

1,962 +11

863 +2

ipex-llm by intel-analytics

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, etc.

updated at May 26, 2024, 1:23 p.m.

Python

243 +1

6,099 +27

1,208 +0
