Nessie: Transactional Catalog for Data Lakes with Git-like semantics
updated at May 26, 2024, 10:48 p.m.
Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, and let DQOps run them daily to detect data quality issues.
updated at May 26, 2024, 4:34 p.m.
The Prometheus monitoring system and time series database.
updated at May 26, 2024, 4:07 p.m.
What's in your data? Extract schema, statistics and entities from datasets
updated at May 26, 2024, 7:23 a.m.
Pandas and Spark DataFrame comparison for humans and more!
updated at May 26, 2024, 7:20 a.m.
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lakes, for billions of files. The blob store has O(1) disk seek and cloud tiering. The Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, and Erasure Coding.
updated at May 26, 2024, 4:12 a.m.
An orchestration platform for the development, production, and observation of data assets.
updated at May 26, 2024, 3:57 a.m.
Simple, resilient multi-host container networking and more.
updated at May 26, 2024, 1:48 a.m.