nessie by projectnessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

updated at April 21, 2024, 7:09 p.m.

Java

28 +0

831 +10

115 -1

GitHub
datacompy by capitalone

Pandas and Spark DataFrame comparison for humans and more!

updated at April 21, 2024, 3:19 p.m.

Python

25 +0

379 +2

118 +0

GitHub
lakeFS by treeverse

lakeFS - Data version control for your data lake | Git for data

updated at April 21, 2024, 2:42 p.m.

Go

40 +0

4,059 +6

329 +2

GitHub
influxdb by influxdata

Scalable datastore for metrics, events, and real-time analytics

updated at April 21, 2024, 2:27 p.m.

Rust

741 -2

27,718 +58

3,481 +0

GitHub
rqlite by rqlite

The lightweight, distributed relational database built on SQLite.

updated at April 21, 2024, 2:02 p.m.

Go

228 -1

14,853 +32

678 +3

GitHub
gpdb by greenplum-db

Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.

updated at April 21, 2024, 1:59 p.m.

C

419 +0

6,197 -1

1,702 +5

GitHub
DataProfiler by capitalone

What's in your data? Extract schema, statistics and entities from datasets

updated at April 21, 2024, 1:37 p.m.

Python

21 +0

1,359 +5

154 +0

GitHub
CMAK by yahoo

CMAK is a tool for managing Apache Kafka clusters

updated at April 21, 2024, 1:24 p.m.

Scala

534 +0

11,672 +11

2,496 +0

GitHub
kryo by EsotericSoftware

Java binary serialization and cloning: fast, efficient, automatic

updated at April 21, 2024, 12:03 p.m.

HTML

297 +0

6,067 +6

817 +1

GitHub
snappy by google

A fast compressor/decompressor

updated at April 21, 2024, 10:51 a.m.

C++

195 +0

5,984 +7

967 +1

GitHub
scylladb by scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra

updated at April 21, 2024, 8:41 a.m.

C++

341 +0

12,509 +44

1,200 +4

GitHub
superset by apache

Apache Superset is a Data Visualization and Data Exploration Platform

updated at April 21, 2024, 2:13 a.m.

TypeScript

1,497 +0

58,737 +162

12,536 +53

GitHub
seaweedfs by seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.

updated at April 21, 2024, 1:55 a.m.

Go

535 +0

21,015 +58

2,164 +7

GitHub
pyxley by stitchfix

Python helpers for building dashboards using Flask and React

updated at April 21, 2024, 1:41 a.m.

JavaScript

277 +0

2,276 -1

258 +0

GitHub
dash by plotly

Data Apps & Dashboards for Python. No JavaScript Required.

updated at April 21, 2024, 1:40 a.m.

Python

416 -1

20,471 +39

1,985 +2

GitHub
metabase by metabase

The simplest, fastest way to get business intelligence and analytics to everyone in your company

updated at April 21, 2024, 1:38 a.m.

Clojure

643 +1

36,448 +86

4,853 +7

GitHub
nomad by hashicorp

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.

updated at April 21, 2024, 1:38 a.m.

Go

538 +0

14,408 +17

1,878 +4

GitHub
dagster by dagster-io

An orchestration platform for the development, production, and observation of data assets.

updated at April 21, 2024, 1:36 a.m.

Python

112 +0

10,180 +58

1,259 +6

GitHub
dqo by dqops

Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, let DQOps run the data quality checks daily to detect data quality issues.

updated at April 20, 2024, 11:29 p.m.

Java

5 +0

52 +0

11 +0

GitHub
airflow by apache

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

updated at April 20, 2024, 9:59 p.m.

Python

754 +2

34,425 +83

13,523 +18

GitHub