elephant-bird by twitter

Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.

created at March 25, 2010, 1:49 a.m.

Java

189 -1

1,137 +0

390 +0

GitHub
YCSB by brianfrankcooper

Yahoo! Cloud Serving Benchmark

created at April 19, 2010, 8:52 p.m.

Java

215 +0

4,812 +9

2,196 +2

GitHub
akela by mozilla-metrics

A bunch of utility classes for Java, Hadoop, HBase, Pig, etc.

created at Dec. 11, 2010, 12:36 a.m.

Java

23 +0

76 +0

31 +0

GitHub
HiveSwarm by livingsocial

Helpful user-defined functions / table-generating functions for Hive

created at April 5, 2011, 5:46 p.m.

Java

66 +0

101 +0

46 +0

GitHub
Hive-Extensions-from-Think-Big-Analytics by ThinkBigAnalytics

Reusable code for Hive

created at April 6, 2011, 1:45 a.m.

Java

316 +0

16 +0

14 +0

GitHub
varaha by thedatachef

Machine learning and natural language processing with Apache Pig

created at April 25, 2011, 3:39 a.m.

Java

9 +0

53 +0

15 +0

GitHub
gdata-storagehandler by balshor

A Hive StorageHandler that uses a Google Spreadsheet as a backend.

created at Aug. 19, 2011, 2:40 a.m.

Java

3 +0

14 +0

4 +0

GitHub
hive_test by edwardcapriolo

Unit test framework for hive and hive-service

created at Sept. 16, 2011, 2:39 p.m.

Java

18 +0

64 +1

47 +0

GitHub
Hive-mongo by yc-huang

Hive storage handler for connecting with MongoDB

created at Nov. 17, 2011, 7:24 a.m.

Java

10 +0

32 -1

33 +0

GitHub
hive_cassandra_udfs by edwardcapriolo

User Defined Functions for Hive to work with Cassandra

created at Dec. 29, 2011, 3:24 p.m.

Java

4 +0

11 +0

5 +0

GitHub
ls-hive by lovelysystems

Lovely Systems Hive Goodies

created at Jan. 24, 2012, 3:12 p.m.

Java

16 +0

5 +0

2 +0

GitHub
HiBench by Intel-bigdata

HiBench is a big data benchmark suite.

created at June 12, 2012, 7:56 a.m.

Java

126 +0

1,433 +0

756 +0

GitHub
flume-ng-rabbitmq by jcustenborder

Flume plugin for RabbitMQ

created at June 13, 2012, 8:22 p.m.

Java

10 +0

59 +0

46 +0

GitHub
mpich2-yarn by alibaba

Running MPICH2 on Yarn

created at Aug. 23, 2012, 3:57 a.m.

Java

34 +0

114 +0

62 +0

GitHub
flume-ng-mongodb-sink by leonlee

Flume NG MongoDB sink.

created at Sept. 28, 2012, 2:15 a.m.

Java

13 +0

71 +0

62 +0

GitHub
white-elephant by LinkedInAttic

Hadoop log aggregator and dashboard

created at Jan. 24, 2013, 11:26 p.m.

Java

97 +0

190 +0

63 +0

GitHub
accumulo-hive-storage-manager by bfemiano

Working commits for Hive connector to Accumulo. This will eventually be checked directly into Accumulo.

created at March 2, 2013, 10:57 a.m.

Java

6 +0

13 +0

12 +0

GitHub
hive-solr by chimpler

Hive Storage Handler for SOLR

created at March 8, 2013, 1:41 a.m.

Java

10 +0

16 -1

26 +0

GitHub
elasticsearch-hadoop by elastic

Elasticsearch real-time search and analytics natively integrated with Hadoop

created at March 11, 2013, 6:57 p.m.

Java

488 -2

1,925 +1

981 +0

GitHub
suro by Netflix

Netflix's distributed Data Pipeline

created at March 20, 2013, 9:02 p.m.

Java

508 +0

789 +0

168 +0

GitHub