DoubleSegmentedSortedMultiset infinite loop
Potential infinite loop. Easy to reproduce, although the reproducer is not minimized yet.
from deephaven.column import double_col, int_col
from deephaven.agg import (
sorted_first,
sorted_last,
count_distinct,
distinct,
min_,
max_,
)
x = new_table(
[double_col("X", [-0.0, 0.0, float("nan"), float("inf"), float("-inf"), None])]
).update_view(["XStr=Double.toString(X)"])
x_aggs = merge([x] * 700).agg_by(
[
sorted_first("X", ["SortedFirst=X"]),
sorted_last("X", ["SortedLast=X"]),
count_distinct(["CountDistinct=X"]),
count_distinct(["CountDistinctWithNulls=X"], count_nulls=True),
distinct(["Distinct=X"]),
distinct(["DistinctWithNulls=X"], include_nulls=True),
min_(["Min=X"]),
max_(["Max=X"]),
]
)
results in spinning computation. Resulting stacktrace:
"DeephavenApiServer-Scheduler-Serial-1" #131 [367] daemon prio=5 os_prio=0 cpu=49186.83ms elapsed=1993.30s tid=0x00007fca4c185ce0 nid=367 runnable [0x00007fcd02bfb000]
java.lang.Thread.State: RUNNABLE
at io.deephaven.engine.table.impl.ssms.DoubleSegmentedSortedMultiset.upperBound(DoubleSegmentedSortedMultiset.java:781)
at io.deephaven.engine.table.impl.ssms.DoubleSegmentedSortedMultiset.insertExistingIntoLeaf(DoubleSegmentedSortedMultiset.java:95)
at io.deephaven.engine.table.impl.ssms.DoubleSegmentedSortedMultiset.insertExisting(DoubleSegmentedSortedMultiset.java:398)
at io.deephaven.engine.table.impl.ssms.DoubleSegmentedSortedMultiset.insert(DoubleSegmentedSortedMultiset.java:439)
at io.deephaven.engine.table.impl.ssms.DoubleSegmentedSortedMultiset.insert(DoubleSegmentedSortedMultiset.java:77)
at io.deephaven.engine.table.impl.by.ssmcountdistinct.count.DoubleChunkedCountDistinctOperator.addChunk(DoubleChunkedCountDistinctOperator.java:212)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.processColumnNoKey(ChunkedOperatorAggregationHelper.java:2506)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.doNoKeyUpdate(ChunkedOperatorAggregationHelper.java:2396)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.doNoKeyAddition(ChunkedOperatorAggregationHelper.java:2233)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.noKeyAggregation(ChunkedOperatorAggregationHelper.java:2053)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.aggregation(ChunkedOperatorAggregationHelper.java:156)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.lambda$aggregation$2(ChunkedOperatorAggregationHelper.java:134)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper$$Lambda/0x00007fccac8cefc0.call(Unknown Source)
at io.deephaven.engine.table.impl.BaseTable.initializeWithSnapshot(BaseTable.java:1293)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.lambda$aggregation$3(ChunkedOperatorAggregationHelper.java:131)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper$$Lambda/0x00007fccac8ce720.get(Unknown Source)
at io.deephaven.engine.liveness.LivenessScopeStack.computeEnclosed(LivenessScopeStack.java:179)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.aggregation(ChunkedOperatorAggregationHelper.java:119)
at io.deephaven.engine.table.impl.by.ChunkedOperatorAggregationHelper.aggregation(ChunkedOperatorAggregationHelper.java:74)
at io.deephaven.engine.table.impl.QueryTable.lambda$aggNoMemo$18(QueryTable.java:845)
at io.deephaven.engine.table.impl.QueryTable$$Lambda/0x00007fccac8cd5e8.get(Unknown Source)
at io.deephaven.engine.table.impl.perf.QueryPerformanceRecorder.withNugget(QueryPerformanceRecorder.java:369)
at io.deephaven.engine.table.impl.QueryTable.aggNoMemo(QueryTable.java:844)
at io.deephaven.engine.table.impl.QueryTable.lambda$aggBy$17(QueryTable.java:820)
at io.deephaven.engine.table.impl.QueryTable$$Lambda/0x00007fccac8c7be0.get(Unknown Source)
at io.deephaven.engine.table.impl.QueryTable$MemoizedResult.getOrCompute(QueryTable.java:3674)
- locked <0x00000000e0fa96c8> (a io.deephaven.engine.table.impl.QueryTable$MemoizedResult)
at io.deephaven.engine.table.impl.QueryTable.memoizeResult(QueryTable.java:3643)
at io.deephaven.engine.table.impl.QueryTable.aggBy(QueryTable.java:820)
at io.deephaven.engine.table.impl.QueryTable.aggBy(QueryTable.java:100)
at io.deephaven.api.TableOperationsDefaults.aggBy(TableOperationsDefaults.java:312)
Here's the conditions that cause the infinite loop:
lo = 4
hi = 5
searchValue = NaN
valuesToSearch = [-Infinity, NULL_DOUBLE, -0.0, Infinity, NaN]
mid = 4
testValue = NaN
moveHi = true
The infinite loop isn't in this inner layer; it's the combination of the inner and outer layer.
Note that valuesToSearch is not in order wrt DH comparisons; NULL_DOUBLE should be first if it were.
/**
* Insert new valuesToInsert into this SSMS. The valuesToInsert to insert must be sorted, without duplicates.
*
* The valuesToInsert and counts chunks will be modified during this call, and the resulting chunks are undefined.
*
* @param valuesToInsert the valuesToInsert to insert
* @param counts the number of times each value occurs
* @return true if any new values were inserted
*/
boolean insert(WritableChunk<? extends Values> valuesToInsert, WritableIntChunk<ChunkLengths> counts);
The valuesToInsert are being sorted in DoubleCompactKernel#compactAndCount using WritableDoubleChunk#sort(int, int), which is using java.util.Arrays#sort(double[], int, int) which does not sort according to DH sort order.
It appears even if WritableDoubleChunk#sort(int, int) is "fixed" (it may be breaking some other implicit assumption somewhere else, TBD), DoubleSegmentedSortedMultiset#insertExistingIntoLeaf can get into an infinite loop.
Likely b/c DoubleSegmentedSortedMultiset is using
private static boolean eq(double lhs, double rhs) {
// region equality function
return lhs == rhs;
// endregion equality function
}
w/ nextValue = NaN
The sort order needs to be more clearly defined wrt CompactKernel#compactAndCount and SegmentedSortedMultiSet#insert; are one or both java array sorting order, DH sorting order, or a mix?
WritableChunk explicitly declares Java's "primitive" defined ordering
/**
* Sort this chunk in-place using Java's primitive defined ordering.
* <p>
* Of note is that nulls or NaNs are not sorted according to Deephaven ordering rules.
*/
void sort();
which is not technically true (since Arrays#sort is different than primitive ordering); and furthermore, WritableCharChunk applies a "fixup".