weave by weaveworks

Simple, resilient multi-host container networking and more.

created at Aug. 18, 2014, 5:19 a.m.

Go

237 +0

6,577 +2

662 +0

gpdb by greenplum-db

Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.

created at Oct. 23, 2015, 12:25 a.m.

C

419 +0

6,197 -1

1,702 +5

kryo by EsotericSoftware

Java binary serialization and cloning: fast, efficient, automatic

created at Nov. 6, 2013, 1:24 p.m.

HTML

297 +0

6,067 +6

817 +1

snappy by google

A fast compressor/decompressor

created at March 3, 2014, 9:58 p.m.

C++

195 +0

5,984 +7

967 +1

kcat by edenhill

Generic command line non-JVM Apache Kafka producer and consumer

created at March 30, 2014, 4:25 a.m.

C

78 +2

5,240 +12

471 +0

opentsdb by OpenTSDB

A scalable, distributed Time Series Database.

created at Aug. 27, 2010, 2:05 a.m.

Java

337 +0

4,948 +1

1,251 +0

zombodb by zombodb

Making Postgres and Elasticsearch work together like it's 2023

created at July 17, 2015, 4:53 p.m.

PLpgSQL

95 +0

4,608 +1

209 +0

lakeFS by treeverse

lakeFS - Data version control for your data lake | Git for data

created at Sept. 12, 2019, 11:46 a.m.

Go

40 +0

4,059 +6

329 +2

rudder-server by rudderlabs

Privacy and Security focused Segment-alternative, in Golang and React

created at July 19, 2019, 9:24 a.m.

Go

61 +0

3,923 +11

289 +3

aws-sdk-pandas by aws

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

created at Feb. 26, 2019, 1:39 a.m.

Python

61 +0

3,799 +8

666 +2

heka by mozilla-services

DEPRECATED: Data collection and processing made easy.

created at Oct. 16, 2012, 5:20 p.m.

Go

204 +0

3,399 +0

531 +0

flocker by ClusterHQ

Container data volume manager for your Dockerized application

created at April 28, 2014, 6:02 p.m.

Python

168 +0

3,376 +0

286 +0

flockdb by twitter-archive

A distributed, fault-tolerant graph database

created at April 12, 2010, 3:53 a.m.

Scala

279 +0

3,328 +1

273 +0

smart_open by piskvorky

Utils for streaming large files (S3, HDFS, gzip, bz2...)

created at Jan. 2, 2015, 1:05 p.m.

Python

49 +0

3,087 +3

378 +1

elasticsearch-jdbc by jprante

JDBC importer for Elasticsearch

created at June 2, 2012, 11:17 p.m.

Java

231 +0

2,839 -1

712 +0

kafka-node by SOHU-Co

Node.js client for Apache Kafka 0.8 and later.

created at Oct. 23, 2013, 3:34 a.m.

JavaScript

99 +0

2,658 -1

630 +0

pipelinedb by pipelinedb

High-performance time-series aggregation for PostgreSQL

created at Nov. 26, 2013, 12:11 a.m.

C

106 +0

2,614 +0

237 +0

pyxley by stitchfix

Python helpers for building dashboards using Flask and React

created at June 22, 2015, 10:23 p.m.

JavaScript

277 +0

2,276 -1

258 +0

gobblin by apache

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

created at Dec. 1, 2014, 6:10 p.m.

Java

167 +0

2,190 +0

742 +0

secor by pinterest

Secor is a service implementing Kafka log persistence

created at April 15, 2014, 10:26 p.m.

Java

70 +0

1,834 +2

541 +1
