juicefs by juicedata

JuiceFS is a distributed POSIX file system built on top of Redis and S3.

updated at June 2, 2024, 10:46 a.m.

Go

NEW!

112 +0

9,928 +0

878 +0

GitHub
scylladb by scylladb

NoSQL data store using the Seastar framework, compatible with Apache Cassandra

updated at June 2, 2024, 10:38 a.m.

C++

337 +0

12,744 +39

1,229 +6

GitHub
cadvisor by google

Analyzes resource usage and performance characteristics of running containers.

updated at June 2, 2024, 9:54 a.m.

Go

387 +0

16,488 +28

2,279 +4

GitHub
dqo by dqops

Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, and let DQOps run them daily to detect data quality issues.

updated at June 2, 2024, 9:41 a.m.

Java

5 +0

64 +2

12 +0

GitHub
dagster by dagster-io

An orchestration platform for the development, production, and observation of data assets.

updated at June 2, 2024, 9:02 a.m.

Python

116 +0

10,472 +32

1,307 +3

GitHub
lakeFS by treeverse

lakeFS - Data version control for your data lake | Git for data

updated at June 2, 2024, 8:53 a.m.

Go

40 +0

4,128 +15

331 +1

GitHub
gobblin by apache

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

updated at June 2, 2024, 8:27 a.m.

Java

166 +0

2,199 +3

744 +1

GitHub
rqlite by rqlite

The lightweight, distributed relational database built on SQLite.

updated at June 2, 2024, 6:13 a.m.

Go

227 +0

15,058 +40

691 +3

GitHub
aws-sdk-pandas by aws

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

updated at June 2, 2024, 6:11 a.m.

Python

61 +1

3,826 +5

670 +0

GitHub
influxdb by influxdata

Scalable datastore for metrics, events, and real-time analytics

updated at June 2, 2024, 5:51 a.m.

Rust

736 +0

27,972 +54

3,493 +1

GitHub
seaweedfs by seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lakes, built for billions of files. The blob store has O(1) disk seek and cloud tiering. The filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, and erasure coding.

updated at June 2, 2024, 5:49 a.m.

Go

536 +1

21,387 +63

2,193 +6

GitHub
prometheus by prometheus

The Prometheus monitoring system and time series database.

updated at June 2, 2024, 5:27 a.m.

Go

1,129 -1

53,299 +99

8,825 +12

GitHub
nessie by projectnessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

updated at June 2, 2024, 5:07 a.m.

Java

27 +0

872 +6

117 +1

GitHub
nomad by hashicorp

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.

updated at June 2, 2024, 3:59 a.m.

Go

534 -1

14,487 +8

1,910 +3

GitHub
druid by apache

Apache Druid: a high-performance real-time analytics database.

updated at June 2, 2024, 3:24 a.m.

Java

589 -1

13,234 +6

3,647 +0

GitHub
multiwoven by Multiwoven

🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack - Reverse ETL & Customer Data Platform (CDP)

updated at June 2, 2024, 3:18 a.m.

Ruby

12 +0

666 +9

33 +2

GitHub
airflow by apache

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

updated at June 2, 2024, 2:21 a.m.

Python

753 -2

34,893 +82

13,666 +23

GitHub
protobuf by protocolbuffers

Protocol Buffers - Google's data interchange format

updated at June 2, 2024, 2:07 a.m.

C++

2,052 -1

64,097 +174

15,308 +11

GitHub
rudder-server by rudderlabs

Privacy- and security-focused Segment alternative, in Golang and React

updated at June 2, 2024, 1:53 a.m.

Go

61 +0

3,969 +9

297 +1

GitHub
superset by apache

Apache Superset is a Data Visualization and Data Exploration Platform

updated at June 2, 2024, 12:26 a.m.

TypeScript

1,500 +0

59,721 +128

12,789 +29

GitHub