sparkling-water by h2oai

Sparkling Water provides H2O functionality inside a Spark cluster

updated at May 14, 2024, 9:07 a.m.

Scala

179 +0

952 +1

363 +0

delight by datamechanics

A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.

updated at May 15, 2024, 3:16 p.m.

Scala

16 +0

336 +1

51 +0

aerosolve by airbnb

A machine learning package built for humans.

updated at May 17, 2024, 5:41 a.m.

Scala

351 +0

4,793 +2

567 +0

scalding by twitter

A Scala API for Cascading

updated at May 17, 2024, 4:40 p.m.

Scala

322 -1

3,471 -1

703 +0

summingbird by twitter

Streaming MapReduce with Scalding and Storm

updated at May 18, 2024, 4:28 a.m.

Scala

292 +0

2,135 -3

267 +0

factorie by factorie

FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.

updated at May 18, 2024, 4:32 a.m.

Scala

70 +0

552 -1

145 +0

spark by apache

Apache Spark - A unified analytics engine for large-scale data processing

updated at May 18, 2024, 8:46 p.m.

Scala

2,030 -1

38,531 +57

27,979 +10

spark-nlp by JohnSnowLabs

State of the Art Natural Language Processing

updated at May 19, 2024, 4:15 a.m.

Scala

100 +0

3,716 +8

702 +0

SynapseML by Microsoft

Simple and Distributed Machine Learning

updated at May 19, 2024, 7:27 a.m.

Scala

146 +0

4,984 +9

818 +3
