Ted Malaska

8 repositories owned by Ted Malaska

CopybookInputFormat

18 Stars, 16 Forks

Uses JRecord to build a mapred and mapreduce InputFormat for HDFS, MapReduce, Pig, Hive, Spark, ...

HBase-ToHDFS

27 Stars, 25 Forks

Reads an HBase table and writes the output as Text, Seq, Avro, or Parquet

Spark.TableStatsExample

28 Stars, 30 Forks

Simple Spark example of generating table stats for use in data quality checks

SparkOnALog

19 Stars, 21 Forks

Examples of integrating Spark Streaming, Flume, and HBase to solve streaming problems

SparkOnHBase

21 Stars, 17 Forks

SparkOnKudu

52 Stars, 46 Forks

Based on the design of SparkOnHBase. This repo will support Spark, Spark Streaming, and Spark SQL integration with Kudu.

SparkStreaming.Sessionization

50 Stars, 42 Forks

Near-real-time (NRT) sessionization with Spark Streaming, landing data on HDFS and putting live stats in HBase

SparkUnitTestingExamples

35 Stars, 29 Forks

This project is a collection of Spark unit test examples that show new Spark users how to unit test their code for Spark Core, Spark SQL, and Spark Streaming