spark by apache

Apache Spark - A unified analytics engine for large-scale data processing

updated at Dec. 1, 2024, 3:30 a.m.

Scala

2,022 -1

40,078 +58

28,355 +17

GitHub
algebird by twitter

Abstract Algebra for Scala

updated at Nov. 30, 2024, 8:50 a.m.

Scala

233 +0

2,288 -1

345 +0

GitHub
breeze by scalanlp

Breeze is/was a numerical processing library for Scala.

updated at Nov. 30, 2024, 8:50 a.m.

Scala

206 +0

3,448 +0

691 +0

GitHub
SynapseML by Microsoft

Simple and Distributed Machine Learning

updated at Nov. 30, 2024, 12:01 a.m.

Scala

146 +0

5,070 +2

831 +0

GitHub
scalding by twitter

A Scala API for Cascading

updated at Nov. 29, 2024, 9:05 p.m.

Scala

322 +0

3,505 +4

708 +0

GitHub
spark-nlp by JohnSnowLabs

State of the Art Natural Language Processing

updated at Nov. 29, 2024, 6:58 p.m.

Scala

101 +1

3,877 +1

712 +1

GitHub
delight by datamechanics

A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.

updated at Nov. 29, 2024, 4:34 p.m.

Scala

16 +0

343 +1

53 +0

GitHub
factorie by factorie

FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.

updated at Nov. 29, 2024, 1:31 p.m.

Scala

69 +0

552 +0

144 +0

GitHub
predictionio by apache

PredictionIO, a machine learning server for developers and ML engineers.

updated at Nov. 28, 2024, 4:31 p.m.

Scala

756 +0

12,542 -2

1,927 +0

GitHub
aerosolve by airbnb

A machine learning package built for humans.

updated at Nov. 28, 2024, 4:31 p.m.

Scala

351 -1

4,794 -1

562 +0

GitHub
summingbird by twitter

Streaming MapReduce with Scalding and Storm

updated at Nov. 28, 2024, 4:30 p.m.

Scala

292 +0

2,135 -2

267 +0

GitHub
bioscala by bioscala

Bioinformatics for the Scala programming language

updated at Nov. 27, 2024, 7:54 a.m.

Scala

15 +0

109 +0

20 +0

GitHub
sparkling-water by h2oai

Sparkling Water provides H2O functionality inside a Spark cluster

updated at Nov. 26, 2024, 5:11 p.m.

Scala

179 -1

967 -1

359 -1

GitHub
BIDMach by BIDData

CPU and GPU-accelerated Machine Learning Library

updated at Nov. 19, 2024, 1:46 p.m.

Scala

87 +0

915 +0

168 +0

GitHub
adam by bigdatagenomics

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

updated at Nov. 4, 2024, 1:06 a.m.

Scala

99 +0

1,003 +0

309 +0

GitHub
tensorflow_scala by eaplatanios

TensorFlow API for the Scala Programming Language

updated at Oct. 28, 2024, 9:40 p.m.

Scala

65 +0

939 +0

95 +0

GitHub
brushfire by stripe-archive

Distributed decision tree ensemble learning in Scala

updated at Oct. 21, 2024, 11:46 a.m.

Scala

94 +0

392 +0

50 +0

GitHub
DynaML by transcendent-ai-labs

Scala Library/REPL for Machine Learning Research

updated at Oct. 7, 2024, 8:47 p.m.

Scala

18 +0

201 +0

51 +0

GitHub
sarah-palin-lda by wavelets

Topic Modeling the Sarah Palin emails.

updated at Sept. 30, 2024, 10:37 p.m.

Scala

3 +0

9 +0

3 +0

GitHub
mist by Hydrospheredata

Serverless proxy for Spark clusters

updated at Sept. 15, 2024, 2:21 p.m.

Scala

39 +0

326 +0

68 +0

GitHub