Accelerate tree traversal in auto collision problems
Fix #409
Something else to mention. For this algorithm to work, the ordering of the primitives and predicates must be the same. Otherwise, edges may be missed.
There are two ways we can guarantee this:
- The easiest thing to guarantee that would be to store the permutation produced for primitives, and reapply it for predicates.
- An alternative approach is to use a stable sort for sorting Morton indices (we do not do right now). The reason is that if there are duplicated Morton indices, their ordering is not guaranteed in non-stable sort and may depend on runtime.
I rebased the branch on the current master. With the recent changes, the patch is really small now.
In my previous comment, I proposed two ways to address a potential issue of not having the same permutation for the leaf nodes and queries, leading to subtle bugs. I completely missed a third alternative, which is much cleaner.
Specifically, when using queries with a quick traversal callback, we should construct a permutation using the permutation indices stored in leaf nodes, and apply it to queries. This a) avoids calling unstable sort, and b) avoids storing the permutation as a separate entity in BVH. It should be of small cost, as it's O(n) operation through a single parallel_for() call, and should even be faster than computing Morton indices and sorting as we do right now.
@dalg24 I'm not 100% certain what's the best way to implement this. It seems like we need some extra function in BVH which will construct a permutation either through calling sortObjects, or by going through leaf nodes. This function will have to be specialized for the signature of the callback. Thoughts about where this function should live?
Of note, sorting queries based on leaf node permutation should be done for all self-collision problems, and should not require the callback to be half-traversal. So we would want some indication from a user whether a problem is self-collision. On the other hand, half-traversal require this permutation.
It just occurred to me that it is possible to see this approach from a different angle.
Consider assigning each leaf node a label, and then reducing them to internal nodes with a max operator (similar to #600, but we did a different operator there). Then, each internal node knows the max label of all of its children. If we assign a label to query, we can ask it to bypass all nodes where the max label is less than the query label. The only challenge would be to assign the node labels based on permutation of the leaf nodes.
I think that overall we should strive to implement a general data reduction algorithm on a tree with custom reducers. I think this would open many interesting algorithmic options, particularly custom traversal with skips.
Closing in favor of some version of #801.