storehaus
StorehausOutputFormat
I'm thinking of a Hadoop output format for generating many Storehaus persistences.
The output format would:
- accept a number of shards and a shard function,
- assign a shard to each key,
- sort by (shard, key),
- create N Stores on disk.
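To make the steps above concrete, here is a minimal in-memory sketch of the shard, sort, and split logic. It is only an illustration, not a Hadoop `OutputFormat` and not Storehaus code: `shard_and_sort` and `modulo_shard` are hypothetical names, plain Python lists stand in for the on-disk Stores, and keys are assumed to be orderable.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Tuple[Any, Any]  # (key, value)


def modulo_shard(key: int, num_shards: int) -> int:
    """Hypothetical shard function: integer key modulo the shard count."""
    return key % num_shards


def shard_and_sort(
    records: Iterable[Record],
    num_shards: int,
    shard_fn: Callable[[Any, int], int],
) -> Dict[int, List[Record]]:
    """Assign a shard to each key, sort by (shard, key), split per shard."""
    # Tag every record with the shard its key maps to.
    tagged = [(shard_fn(key, num_shards), key, value) for key, value in records]
    # Sort by (shard, key) so each shard's records are contiguous and ordered.
    tagged.sort(key=lambda t: (t[0], t[1]))
    # One list per shard stands in for one Store on disk (assumption).
    shards: Dict[int, List[Record]] = {i: [] for i in range(num_shards)}
    for shard, key, value in tagged:
        shards[shard].append((key, value))
    return shards


if __name__ == "__main__":
    sample = [(5, "e"), (2, "b"), (4, "d"), (1, "a")]
    for shard, rows in shard_and_sort(sample, 2, modulo_shard).items():
        print(shard, rows)
```

In a real job the sort would presumably happen in the shuffle, with the shard function acting as the partitioner, so each reducer would see one shard's keys in order and could write them sequentially.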
Paired with #47 and the VersionedTap in dfs-datastores, each output Store would be a proper VersionedStore.
Relevant: https://github.com/avibryant/rdb
Awesome, thanks dude.