registry by hortonworks

Schema Registry

created at Oct. 26, 2016, 8:28 a.m.

Java

206 +0

15 +0

8 +0

GitHub
schema-registry by confluentinc

Confluent Schema Registry for Kafka

created at Dec. 9, 2014, 10:38 p.m.

Java

379 +0

2,225 +1

1,114 +1

GitHub
gobblin by apache

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

created at Dec. 1, 2014, 6:10 p.m.

Java

165 +0

2,228 +0

751 +1

GitHub
oryx by OryxProject

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning

created at July 25, 2014, 8:08 p.m.

Java

208 +0

1,788 +0

405 +0

GitHub
ankush by Impetus

A big data cluster management tool that creates and manages clusters of different technologies.

created at May 29, 2014, 10:05 a.m.

Java

13 +0

21 +0

17 +0

GitHub
flume-udp-source by whitepages

Apache Flume source plugin allowing direct consumption of UDP messages

created at March 18, 2014, 11:32 p.m.

Java

4 +0

8 +0

9 +0

GitHub
Beetest by kawaa

A super simple utility for testing Apache Hive scripts locally for non-Java developers.

created at Dec. 7, 2013, 6:17 p.m.

Java

8 +0

72 +0

23 +0

GitHub
HiveRunner by HiveRunner

An Open Source unit test framework for Hive queries based on JUnit 4 and 5

created at Nov. 22, 2013, 9:19 a.m.

Java

34 +0

255 +0

77 +0

GitHub
haeinsa by VCNC

Haeinsa is a linearly scalable multi-row, multi-table transaction library for HBase

created at Aug. 10, 2013, 3:43 p.m.

Java

30 +0

158 +0

42 +0

GitHub
hindex by Huawei-Hadoop

Secondary Index for HBase

created at Aug. 8, 2013, 11:33 a.m.

Java

134 +0

591 +0

286 +0

GitHub
genie by Netflix

Distributed Big Data Orchestration Service

created at June 20, 2013, 8:35 p.m.

Java

528 +1

1,716 +0

369 +0

GitHub
Hive-Cassandra by dvasilen

Hive Storage Handler for Cassandra (cloned from https://github.com/riptano/hive/tree/hive-0.8.1-merge/cassandra-handler)

created at April 12, 2013, 7:33 p.m.

Java

9 +0

15 +0

14 +0

GitHub
suro by Netflix

Netflix's distributed Data Pipeline

created at March 20, 2013, 9:02 p.m.

Java

514 +1

794 +0

171 +0

GitHub
elasticsearch-hadoop by elastic

Elasticsearch real-time search and analytics natively integrated with Hadoop

created at March 11, 2013, 6:57 p.m.

Java

180 +1

9 +0

990 +0

GitHub
hive-solr by chimpler

Hive Storage Handler for SOLR

created at March 8, 2013, 1:41 a.m.

Java

10 +0

16 +0

26 +0

GitHub
accumulo-hive-storage-manager by bfemiano

Working commits for Hive connector to Accumulo. This will eventually be checked directly into Accumulo.

created at March 2, 2013, 10:57 a.m.

Java

6 +0

13 +0

12 +0

GitHub
white-elephant by LinkedInAttic

Hadoop log aggregator and dashboard

created at Jan. 24, 2013, 11:26 p.m.

Java

97 +0

192 +0

62 +0

GitHub
flume-ng-mongodb-sink by leonlee

Flume NG MongoDB sink.

created at Sept. 28, 2012, 2:15 a.m.

Java

13 +0

71 +0

62 +0

GitHub
mpich2-yarn by alibaba

Running MPICH2 on YARN

created at Aug. 23, 2012, 3:57 a.m.

Java

34 +0

114 +0

62 +0

GitHub
flume-ng-rabbitmq by jcustenborder

Flume plugin for RabbitMQ

created at June 13, 2012, 8:22 p.m.

Java

10 +0

58 +0

46 +0

GitHub