highway
highway copied to clipboard
Does VQSort support custom object and comparison function ?
After reading this blog, I am interested in this super fast VQSort()
and want to use it to replace the spec_qsort()
function inside mcf benchmark.
However, instead of comparing values, the spec_qsort()
function sorts custom objects using a compare function which is passed as one of function arguments.
May I ask how does VQSort()
compare custom object ? Any documentation or sample I should read ?
Hi, VQSort currently does not support custom comparators. As a workaround, if the comparator is just a lexicographic ordering of up to 128 bits of data, you can use the K64V64 type. It is possible to implement a vectorized comparator, but that would require large changes to the sorting network.
Thanks Dr.@jan-wassenberg for the reply. I think the custom comparator just compares a value in lexicographic ordering, but it requires deferencing pointers and branches. The branches in the comparator might prevent vectorization. But my target machine supports ARM SVE with scalable predicate registers. Could you hint at me what are the non-trivial changes to replace the spec_qsort()
with VQSort()
?
FYI
// the struct to compare
typedef struct arc arc_t;
struct arc
{
int32_t id;
int32_t cost;
int64_t tail, head;
int16_t ident;
int64_t flow;
};
// the definition of spec_qsort() function
void spec_qsort(void *array, size_t nitems, size_t size, int (*cmp)(const void*,const void*));
1. void *array : an array of arc_t**.
2. size_t nitems: numbers of arc_t** in the array.
3. size_t size: sizeof(arc_t*)
4. function pointer of the comparator.
// the comparator
static int arc_compare( arc_t **a1, arc_t **a2 )
{
if( (*a1)->flow > (*a2)->flow )
return 1;
if( (*a1)->flow < (*a2)->flow )
return -1;
if( (*a1)->id < (*a2)->id )
return -1;
return 1;
}
The nontrivial change would be to write a vectorized comparator, perhaps using gather to load from the pointers. Then you'd also have to change the sorting networks from min+max to comparator+IfThenElse.
The current approach of sorting pointers makes sense if objects are quite large. This one is not really big. If you can bit-pack everything into 128 bits, then sorting the objects directly (casted to K64V64 type) would likely be faster.
Closing, feel free to reopen if you'd like to continue the discussion.