dqo by dqops

Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, then let DQOps run them daily to detect data quality issues.

updated at May 18, 2024, 2:49 p.m.

Java

5 +0

56 +1

12 +0

GitHub
DataProfiler by capitalone

What's in your data? Extract schema, statistics and entities from datasets

updated at May 18, 2024, 3:05 p.m.

Python

21 +0

1,369 +5

156 +0

GitHub
kafka-docker by wurstmeister

Dockerfile for Apache Kafka

updated at May 18, 2024, 3:33 p.m.

Shell

161 +0

6,867 +5

2,736 +2

GitHub
hstream by hstreamdb

HStreamDB is an open-source, cloud-native streaming database for IoT and beyond. Modernize your data stack for real-time applications.

updated at May 18, 2024, 4:01 p.m.

Haskell

23 +0

692 +0

56 +0

GitHub
kcat by edenhill

Generic command line non-JVM Apache Kafka producer and consumer

updated at May 18, 2024, 7:11 p.m.

C

78 +0

5,282 +4

473 +0

GitHub
metabase by metabase

The simplest, fastest way to get business intelligence and analytics to everyone in your company

updated at May 18, 2024, 7:53 p.m.

Clojure

643 +0

36,809 +121

4,881 +10

GitHub
scylladb by scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra

updated at May 18, 2024, 10:19 p.m.

C++

337 -3

12,659 +25

1,215 +3

GitHub
influxdb by influxdata

Scalable datastore for metrics, events, and real-time analytics

updated at May 18, 2024, 10:21 p.m.

Rust

736 -2

27,881 +31

3,490 +0

GitHub
nessie by projectnessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

updated at May 18, 2024, 11:06 p.m.

Java

27 +0

858 +7

116 +0

GitHub
pace by getstrm

Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.

updated at May 19, 2024, 12:02 a.m.

Kotlin

3 +0

31 +0

0 +0

GitHub
prometheus by prometheus

The Prometheus monitoring system and time series database.

updated at May 19, 2024, 12:37 a.m.

Go

1,132 +8

53,086 +104

8,796 +24

GitHub
luigi by spotify

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

updated at May 19, 2024, 1:11 a.m.

Python

473 +0

17,382 +19

2,374 +0

GitHub
dash by plotly

Data Apps & Dashboards for Python. No JavaScript Required.

updated at May 19, 2024, 2:10 a.m.

Python

419 +0

20,611 +29

1,993 +2

GitHub
tidb by pingcap

TiDB is an open-source, cloud-native, distributed, MySQL-compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at: https://tidbcloud.com/free-trial

updated at May 19, 2024, 4:06 a.m.

Go

1,265 -2

36,274 +49

5,720 +2

GitHub
protobuf by protocolbuffers

Protocol Buffers - Google's data interchange format

updated at May 19, 2024, 4:23 a.m.

C++

2,056 +0

63,854 +61

15,291 +13

GitHub
superset by apache

Apache Superset is a Data Visualization and Data Exploration Platform

updated at May 19, 2024, 4:59 a.m.

TypeScript

1,498 +0

59,473 +405

12,709 +48

GitHub
airflow by apache

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

updated at May 19, 2024, 5 a.m.

Python

754 -2

34,730 +86

13,623 +27

GitHub
datacompy by capitalone

Pandas and Spark DataFrame comparison for humans and more!

updated at May 19, 2024, 5:17 a.m.

Python

25 +0

397 +3

122 +0

GitHub
cadvisor by google

Analyzes resource usage and performance characteristics of running containers.

updated at May 19, 2024, 6:41 a.m.

Go

387 +1

16,434 +43

2,278 -1

GitHub
rqlite by rqlite

The lightweight, distributed relational database built on SQLite.

updated at May 19, 2024, 7:38 a.m.

Go

227 +0

14,971 +39

686 +4

GitHub