arrow
                                
                                 arrow copied to clipboard
                                
                                    arrow copied to clipboard
                            
                            
                            
                        ARROW-17290: [C++] Add order-comparisons for numeric scalars
See https://issues.apache.org/jira/browse/ARROW-17290
https://issues.apache.org/jira/browse/ARROW-17290
:warning: Ticket has not been started in JIRA, please click 'Start Progress'.
@pitrou, could you review?
cc @icexelloss
@bkietz Also curious about your opinion on this.
@rtpsw Do we need this for the AsOfJoin improvement?
I developed this PR's code while experimenting with designs for AsOfJoin; however, the design I chose to proceed with does not need this PR's code. So, I'm fine if this doesn't go through. I just figured I'd share the code to see what people think. I noticed that Arrow does not implement ordering for numeric Scalar types, but I wasn't aware of the considerations that @bkietz describes - perhaps this means the corresponding Jira, and not just this PR, is being questioned. The bit twiddling idea for ordering floats crossed my mind, but I opted to start with something simpler.
If this is intended for use in AsofJoin then we should definitely not be using Scalars at all since extracting them from a join key column, boxing them in the type-erased Scalar, then unboxing them again to do type check and comparison would be wasteful compared to doing the comparisons in-place against elements of each column.
I experimented with adding support for updating a Scalar in-place, which potentially could have avoided the above costs in my code, but ran into implementation problems (changes needed in too many places across Arrow) and decided to give up on it.
Tangent: I don't know how applicable this will be, but at least for testing purposes it might be handy to use something like Rust's float_ord to cast floating point numbers into something less subtle to sort. For example: https://gist.github.com/bkietz/8e2ef182883b886e532ffde8e537f7a3
These resources are interesting - thanks for sharing! I also found http://stereopsis.com/radix.html (linked from https://docs.rs/crate/float-ord/latest) interesting.
@rtpsw Do we need this for the AsOfJoin improvement?
No.
Just to clarify, I'm holding on looking into specific comments on this PR until and if we decide to move forward with it.
If we don't currently or in the near future require support for comparison of Scalars in the way this PR provides, then I'd say we should close this PR and mark the JIRA wontfix.