kafkat by airbnb

KafkaT-ool

updated at May 2, 2024, 4:43 p.m.

Ruby

243 +0

503 -1

86 +0

GitHub
kafka-docker by wurstmeister

Dockerfile for Apache Kafka

updated at May 2, 2024, 8:13 p.m.

Shell

160 +0

6,848 +7

2,717 -2

GitHub
gobblin by apache

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

updated at May 2, 2024, 9:14 p.m.

Java

167 +0

2,190 +0

742 +0

GitHub
datacompy by capitalone

Pandas and Spark DataFrame comparison for humans and more!

updated at May 3, 2024, 8:43 a.m.

Python

25 +0

389 +6

122 +2

GitHub
dqo by dqops

Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, and let DQOps run them daily to detect data quality issues.

updated at May 3, 2024, 4:46 p.m.

Java

5 +0

54 +1

11 +0

GitHub
multiwoven by Multiwoven

🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack. Leading Reverse ETL and Customer Data Platform (CDP) for Data Teams.

updated at May 3, 2024, 5:33 p.m.

Ruby

12 +0

637 +14

30 +2

GitHub
kairosdb by kairosdb

Fast scalable time series database

updated at May 3, 2024, 7:05 p.m.

Java

118 +0

1,726 +2

344 -1

GitHub
pyxley by stitchfix

Python helpers for building dashboards using Flask and React

updated at May 3, 2024, 8:04 p.m.

JavaScript

277 +0

2,275 -1

258 +0

GitHub
tidb by pingcap

TiDB is an open-source, cloud-native, distributed, MySQL-compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at: https://tidbcloud.com/free-trial

updated at May 4, 2024, 1:17 a.m.

Go

1,270 -1

36,170 +30

5,715 -1

GitHub
pace by getstrm

Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you to programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.

updated at May 4, 2024, 3:40 a.m.

Kotlin

3 +0

31 +0

0 +0

GitHub
weave by weaveworks

Simple, resilient multi-host container networking and more.

updated at May 4, 2024, 5:07 a.m.

Go

237 +0

6,584 +4

662 -1

GitHub
kryo by EsotericSoftware

Java binary serialization and cloning: fast, efficient, automatic

updated at May 4, 2024, 7:35 a.m.

HTML

296 -1

6,080 +6

817 +0

GitHub
nessie by projectnessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

updated at May 4, 2024, 9:33 a.m.

Java

27 -1

841 +7

116 +1

GitHub
ekuiper by lf-edge

Lightweight data stream processing engine for IoT edge

updated at May 4, 2024, 10:26 a.m.

Go

41 +0

1,365 +1

387 +5

GitHub
pg_kafka by xstevens

INACTIVE: A PostgreSQL extension to produce messages to Apache Kafka.

updated at May 4, 2024, 10:28 a.m.

C

9 +0

112 -1

15 +0

GitHub
bottledwater-pg by confluentinc

Change data capture from PostgreSQL into Kafka

updated at May 4, 2024, 10:28 a.m.

C

366 +0

1,524 -2

155 +0

GitHub
gpdb by greenplum-db

Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.

updated at May 4, 2024, 10:31 a.m.

C

418 -1

6,203 +3

1,702 -1

GitHub
snappy by google

A fast compressor/decompressor

updated at May 4, 2024, 12:57 p.m.

C++

195 +0

5,995 +6

969 +1

GitHub
kcat by edenhill

Generic command line non-JVM Apache Kafka producer and consumer

updated at May 4, 2024, 2:01 p.m.

C

79 +0

5,260 +14

473 +1

GitHub
rudder-server by rudderlabs

Privacy and Security focused Segment-alternative, in Golang and React

updated at May 4, 2024, 2:52 p.m.

Go

61 +0

3,940 +8

289 +1

GitHub