
ClassFiles for Consolidator

Open rahulrv1980 opened this issue 11 years ago • 0 comments

Hello,

I am new to the dfs-datastores library. I am trying to use Consolidator to aggregate a large number of small files on Hadoop into larger files, and I have a noob question. Consolidator is modelled as an API. However, the jar containing the classes that implement PathLister and RecordStreamFactory also needs to be made available to the MR job through a separate mechanism (DistributedCache or otherwise). Is that the right way to use Consolidator? Would we need a separate deploy step to ship this jar (containing the PathLister and RecordStreamFactory implementations), since Consolidator does not have a hook for adding it to the DistributedCache?

Thanks, ~Rahul

rahulrv1980, Mar 20 '13 17:03