WritableChunk sort documentation and implementation clarification
As noted in #5820, WritableDoubleChunk#sort uses Arrays#sort(double[], ...) which does not sort NULL_DOUBLEs first; -Infinity is sorted earlier.
It also appears that WritableCharChunk#sort is incorrect when offset != 0
This may not be a bug, as it is "consistent" with the javadocs:
/**
* Sort this chunk in-place using Java's primitive defined ordering.
* <p>
* Of note is that nulls or NaNs are not sorted according to Deephaven ordering rules.
*/
void sort();
In that case though, WritableCharChunk is incorrect because it does sort nulls first.
Of note is that nulls or NaNs are not sorted according to Deephaven ordering rules.
Afaict, Arrays.sort does sort NaNs in the same way DH does - NaNs come last.
Removing the SortFixup region from WritableCharChunk does not cause any tests to fail (normal, nor nightly). This surprises me very much; either, code downstream of WritableCharChunk.sort()
- does their own fixup (not assuming that WritableCharChunk does it for them)
- doesn't care about specifics of char sort (only cares about like values grouped together)
or, our testing in this regard is lacking (and we do have operations that depend on the current implementation of WritableCharChunk.sort).
https://iceberg.apache.org/spec/#sorting is relevant info on how iceberg handles double. We may want to create an issue more generally about floating point (in)consistencies in DH.