joblib by joblib

Computing with Python functions.

updated at May 11, 2024, 9 a.m.

Python

61 +0

3,679 +9

405 +3

hail by hail-is

Cloud-native genomic dataframes and batch computing

updated at May 11, 2024, 1:20 p.m.

Python

55 +0

938 +0

235 +0

spark-nlp by JohnSnowLabs

State of the Art Natural Language Processing

updated at May 11, 2024, 9:33 p.m.

Scala

100 +0

3,708 +9

702 +1

deequ by awslabs

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

updated at May 11, 2024, 11:29 p.m.

Scala

80 +0

3,140 +6

514 +0

mleap by combust

MLeap: Deploy ML Pipelines to Production

updated at May 12, 2024, 1 a.m.

Scala

69 +0

1,496 +2

313 +0

scikit-learn by scikit-learn

scikit-learn: machine learning in Python

updated at May 12, 2024, 2:16 a.m.

Python

2,141 +0

58,265 +63

25,004 +18

ipex-llm by intel-analytics

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, etc.

updated at May 12, 2024, 3:48 a.m.

Python

242 +0

6,049 +48

1,204 +2

mongo-spark by mongodb

The MongoDB Spark Connector

updated at May 12, 2024, 6:15 a.m.

Java

79 +0

702 -1

307 +0
