fdb-record-layer
Scanning an aggregate index can result in an empty, non-end continuation
Scanning an ungrouped aggregate index can produce a malformed continuation. (I think this also happens with a grouped aggregate index if the scan boundary aligns with the group.) The underlying scan is ultimately based on a KeyValueBase.Continuation. This continuation is derived from the final key read, with the prefix that is common to all elements in the scan range removed, so that the prefix doesn't have to be repeated in the continuation. For an ungrouped aggregate index, there is only a single key, and that key is equal to the scan range prefix. The serialized continuation is therefore empty, but in this case it does not mean the end of the scan, because it is returned alongside the value from the aggregate scan.
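The prefix-stripping step can be illustrated with a minimal sketch. It does not use the Record Layer's actual code: the class name, the `stripPrefix` helper, and the byte values are all hypothetical and only mirror the behavior described here.

```java
import java.util.Arrays;

// Hypothetical stand-in for the prefix-stripping step; not the Record Layer implementation.
final class ContinuationPrefixSketch {
    // Returns the bytes of lastKey that follow the scan-range prefix.
    static byte[] stripPrefix(byte[] lastKey, byte[] prefix) {
        if (lastKey.length < prefix.length
                || !Arrays.equals(Arrays.copyOfRange(lastKey, 0, prefix.length), prefix)) {
            throw new IllegalArgumentException("key does not start with the scan prefix");
        }
        return Arrays.copyOfRange(lastKey, prefix.length, lastKey.length);
    }

    public static void main(String[] args) {
        byte[] prefix = {0x15, 0x01};                 // made-up scan-range prefix
        byte[] groupedKey = {0x15, 0x01, 0x02, 0x07}; // prefix plus a grouping suffix
        byte[] ungroupedKey = {0x15, 0x01};           // the key is exactly the prefix

        // A grouped key leaves a non-empty remainder to resume from.
        System.out.println(stripPrefix(groupedKey, prefix).length);   // prints 2
        // The ungrouped key leaves nothing, so the serialized form is empty.
        System.out.println(stripPrefix(ungroupedKey, prefix).length); // prints 0
    }
}
```

When the key equals the prefix, the remainder has length zero even though a value was just read, which is the ambiguous case.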
So, for example, you might get a sequence of results like:
res1: value = 100, continuation = []
res2: no_next_reason=EXHAUSTED, continuation = []
This is bad: a reader that reads one value at a time and then resumes the scan from the returned continuation can get into an infinite loop if the empty continuation is interpreted as a "start" continuation.
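A small simulation of that failure mode, as a sketch only: `scanOne`, `Result`, and `readOneAtATime` are hypothetical names, the scan is faked rather than run against FoundationDB, and the loop is capped so the example terminates.

```java
// Hypothetical simulation of a reader that resumes from an empty continuation.
final class EmptyContinuationLoopSketch {
    // Hypothetical result shape: a value (null when there is none) plus continuation bytes.
    static final class Result {
        final Long value;
        final byte[] continuation;

        Result(Long value, byte[] continuation) {
            this.value = value;
            this.continuation = continuation;
        }
    }

    // Fake single-row aggregate scan with a row limit of one.
    // Assumption: null or zero-length bytes are interpreted as "start from the beginning".
    static Result scanOne(byte[] continuation) {
        boolean atStart = continuation == null || continuation.length == 0;
        if (atStart) {
            // The only key equals the scan prefix, so the continuation comes back empty.
            return new Result(100L, new byte[0]);
        }
        return new Result(null, new byte[0]); // exhausted
    }

    // Reads one value at a time, resuming from the last continuation.
    // maxReads caps the loop, which would otherwise never terminate.
    static int readOneAtATime(int maxReads) {
        byte[] continuation = null;
        int reads = 0;
        while (reads < maxReads) {
            Result result = scanOne(continuation);
            if (result.value == null) {
                break; // never reached: the scan keeps restarting
            }
            reads++;
            continuation = result.continuation;
        }
        return reads;
    }

    public static void main(String[] args) {
        // The reader sees the same value on every iteration until the cap is hit.
        System.out.println(readOneAtATime(5)); // prints 5
    }
}
```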
We noticed this in some of our YAML tests, which do not expect to get a "begin" continuation back from running a query. That can happen if one of these scans lives in the plan stack. For example, a plan like:
AISCAN(T2_I1 <,> BY_GROUP -> [_0: VALUE:[0]]) | MAP (_ AS _0) | ON EMPTY NULL | MAP (coalesce_long(_._0._0, promote(0l AS LONG)) AS _0)
(This is a plausible plan for a query like select count(*) from t2 with an ungrouped count index T2_I1; see aggregate-empty-table.yamsql.) If the underlying scan returns a single result, then the map plan passes the continuation along, and the ON EMPTY plan's OrElseCursor returns an empty continuation as well, though it is not marked as an END continuation. Machinery in the Relational Layer then eventually produces a new BEGINNING continuation, which can result in infinite looping.
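One way to state the broken invariant is as a check that a test harness could run on every continuation it receives. This is a sketch under assumptions: `SketchContinuation` is a hypothetical interface that only loosely resembles a continuation with an end flag and a byte form, and `isMalformed` is not an existing Record Layer or Relational Layer method.

```java
// Hypothetical invariant check: a continuation that is not the end must not serialize to empty bytes.
final class ContinuationInvariantSketch {
    // Hypothetical minimal continuation shape used only for this sketch.
    interface SketchContinuation {
        boolean isEnd();
        byte[] toBytes();
    }

    static SketchContinuation of(final boolean end, final byte[] bytes) {
        return new SketchContinuation() {
            @Override
            public boolean isEnd() {
                return end;
            }

            @Override
            public byte[] toBytes() {
                return bytes;
            }
        };
    }

    // True for the problematic case: empty serialized form without the end marker.
    static boolean isMalformed(SketchContinuation continuation) {
        byte[] bytes = continuation.toBytes();
        return !continuation.isEnd() && bytes != null && bytes.length == 0;
    }

    public static void main(String[] args) {
        System.out.println(isMalformed(of(false, new byte[0])));   // prints true
        System.out.println(isMalformed(of(false, new byte[]{1}))); // prints false
        System.out.println(isMalformed(of(true, null)));           // prints false
    }
}
```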