asterixdb
TestHistogram zipfan dataset as well as the interfaces of the sort based joins
The parallel sort related elements mainly include:
- The parallel sort running framework, including the histogram, histogram merge, and forward operators.
- Two types of histogram and their inner algorithms: a streaming-based numeric histogram and a ternary-based string histogram.
- Four samplers (Bernoulli, reservoir, random, and chain) for further optimizations.
- Some test sets and test code.

Further work: split the implementations into two sub-branches to accommodate the basic types of parallel sort.
- Running framework as well as the numeric type.
- Extend the framework to the string case. Make the inner structures follow the normal rules of Hyracks.

Change-Id: I8eb7f0dddcd4b754b1cbe273ef8db5be966654d5