dfs-datastores: Querying Pail data files using MapReduce

Open pankug opened this issue 9 years ago • 0 comments

I am having a problem with our Lambda Architecture setup. Our data in HDFS is stored in a fact-based Pail format, using Thrift serialization schemas and vertical partitioning.

Is there a direct way to query our data (residing in HDFS) in a batch view, so that we don't have to store the output in ElephantDB or another database, and can instead view the results directly and store them in a readable format?

Is there an example of querying with MapReduce, Pig, or Hive, without using Cascalog?

We are stuck on this problem. Any help with code structure or other techniques, using any set of Big Data tools and languages, would be appreciated.

Jan 14 '16 08:01 pankug
