scylladb by scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra

created at Dec. 24, 2014, 1:16 p.m.

C++

340 +0

12,591 +32

1,208 +5

snappy by google

A fast compressor/decompressor

created at March 3, 2014, 9:58 p.m.

C++

195 +0

5,995 +6

969 +1

kryo by EsotericSoftware

Java binary serialization and cloning: fast, efficient, automatic

created at Nov. 6, 2013, 1:24 p.m.

HTML

296 -1

6,080 +6

817 +0

gobblin by apache

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

created at Dec. 1, 2014, 6:10 p.m.

Java

167 +0

2,190 +0

742 +0

elasticsearch-jdbc by jprante

JDBC importer for Elasticsearch

created at June 2, 2012, 11:17 p.m.

Java

231 +0

2,838 +0

711 -1

rqlite by rqlite

The lightweight, distributed relational database built on SQLite.

created at Aug. 23, 2014, 4:31 a.m.

Go

228 +0

14,909 +33

681 +1

aws-sdk-pandas by aws

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

created at Feb. 26, 2019, 1:39 a.m.

Python

61 +0

3,805 +3

668 +1

weave by weaveworks

Simple, resilient multi-host containers networking and more.

created at Aug. 18, 2014, 5:19 a.m.

Go

237 +0

6,584 +4

662 -1

kafka-node by SOHU-Co

Node.js client for Apache Kafka 0.8 and later.

created at Oct. 23, 2013, 3:34 a.m.

JavaScript

99 +0

2,659 +1

630 +0

PyHive by dropbox

Python interface to Hive and Presto. 🐝

created at Feb. 1, 2014, 9:05 a.m.

Python

62 +0

1,665 +0

552 +0

secor by pinterest

Secor is a service implementing Kafka log persistence

created at April 15, 2014, 10:26 p.m.

Java

70 +0

1,835 +0

541 +0

heka by mozilla-services

DEPRECATED: Data collection and processing made easy.

created at Oct. 16, 2012, 5:20 p.m.

Go

204 +0

3,399 +0

531 +0

kcat by edenhill

Generic command line non-JVM Apache Kafka producer and consumer

created at March 30, 2014, 4:25 a.m.

C

79 +0

5,260 +14

473 +1

ekuiper by lf-edge

Lightweight data stream processing engine for IoT edge

created at July 3, 2019, 7:37 a.m.

Go

41 +0

1,365 +1

387 +5

smart_open by piskvorky

Utils for streaming large files (S3, HDFS, gzip, bz2...)

created at Jan. 2, 2015, 1:05 p.m.

Python

49 +0

3,094 +1

378 +0

Gaffer by gchq

A large-scale entity and relation database supporting aggregation of properties

created at Dec. 14, 2015, 12:12 p.m.

Java

142 +0

1,734 +1

354 +0

kairosdb by kairosdb

Fast scalable time series database

created at Feb. 5, 2013, 10:27 p.m.

Java

118 +0

1,726 +2

344 -1

lakeFS by treeverse

lakeFS - Data version control for your data lake | Git for data

created at Sept. 12, 2019, 11:46 a.m.

Go

40 +0

4,083 +17

329 +0

ccm by riptano

A script to easily create and destroy an Apache Cassandra cluster on localhost

created at March 1, 2011, 9:42 a.m.

Python

76 +0

1,212 +0

302 +0

rudder-server by rudderlabs

Privacy and Security focused Segment-alternative, in Golang and React

created at July 19, 2019, 9:24 a.m.

Go

61 +0

3,940 +8

289 +1
