
Unmatched result of subscript() function

Open kagamiori opened this issue 2 years ago • 1 comments

======> Started iteration 0 (seed: 1088200697)
I1008 09:18:51.890049 1103505 ExpressionFuzzer.cpp:798] Executing expression: subscript("c0","c1")
I1008 09:18:51.890246 1103505 ExpressionFuzzer.cpp:801] 2 vectors as input:
I1008 09:18:51.890280 1103505 ExpressionFuzzer.cpp:803] 	[DICTIONARY MAP<TINYINT,VARCHAR>: 100 elements, 6 nulls], [CONSTANT MAP<TINYINT,VARCHAR>: 100 elements, 10 elements starting at 30 {[30->322] [322->152] -124 => 5uumaOR$[`;FS5#UoV'*-mg3Lq}5wiCByLJ(;NWX9S&X2Z5C$8Vy0EN`I09;Z1n:JcF(OxpQ`:x', [31->324] [324->163] 81 => _#,5/-0oqN(BdZ,Z\$8Er%N:, [32->312] [312->336] -59 => "&'wTdx_VQ$w(?g=j6J_R*Y].A9I_-z:T-3n1>s6u>rM\>L^URGFkryhk:~z, [33->194] [194->92] null => ?Nq", [34->50] [50->163] 81 => ~p/{g:7yw#b9%Qo$sli>J{fea;E4|q$pf~S`NjT``"sQ2B(<im>{)P]o:#SP, ...}], [MAP MAP<TINYINT,VARCHAR>: 35 elements, 2 nulls]
I1008 09:18:51.890678 1103505 ExpressionFuzzer.cpp:803] 	[FLAT TINYINT: 100 elements, 9 nulls]
E1008 09:18:51.894825 1103505 Exceptions.h:68] Line: velox/expression/tests/ExpressionFuzzer.cpp:179, Function:compareVectors, Expression: vec1->equalValueAt(vec2.get(), i, i) Different results at idx '36': 'null' vs. '[36->353] ?Nq"', Source: RUNTIME, ErrorCode: INVALID_STATE
terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
  what():  Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Different results at idx '36': 'null' vs. '[36->353] ?Nq"'
Retriable: False
Expression: vec1->equalValueAt(vec2.get(), i, i)
Function: compareVectors
File: velox/expression/tests/ExpressionFuzzer.cpp
Line: 179
Stack trace:
...

kagamiori avatar Oct 08 '22 16:10 kagamiori

This problem is caused by the Fuzzer generating a map with null keys and passing it to the subscript function, which doesn't expect null map keys. The following loop doesn't check for nulls, so it produces non-deterministic results. The Fuzzer can also generate maps with duplicate keys, in which case the subscript function produces non-deterministic results as well.

      for (size_t offset = offsetStart; offset < offsetEnd; ++offset) {        
        if (decodedMapKeys->valueAt<TKey>(offset) == searchKey) {
          rawIndices[row] = offset;
          found = true;
          break;
        }
      }

CC: @pedroerp

I think the proper solution is to change the Fuzzer so that it does not generate maps with null or duplicate keys.
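For illustration only, here is a minimal standalone sketch of the same key search with an explicit null check. It does not use Velox types: `Key`, `findKeyOffset`, and the use of `std::optional` to model a null key are all assumptions made for this example, not the actual subscript implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a decoded key vector: std::nullopt models a null key.
using Key = std::optional<int8_t>;

// Returns the offset of the first non-null key equal to searchKey within
// [offsetStart, offsetEnd), or std::nullopt if no such key exists.
std::optional<std::size_t> findKeyOffset(
    const std::vector<Key>& keys,
    std::size_t offsetStart,
    std::size_t offsetEnd,
    int8_t searchKey) {
  assert(offsetStart <= offsetEnd && offsetEnd <= keys.size());
  for (std::size_t offset = offsetStart; offset < offsetEnd; ++offset) {
    // Skip null keys explicitly instead of comparing an undefined value.
    if (!keys[offset].has_value()) {
      continue;
    }
    if (*keys[offset] == searchKey) {
      return offset;
    }
  }
  return std::nullopt;
}
```

With keys such as `{-124, 81, null, 81}`, this sketch always picks the first `81`, but that choice is only a convention: duplicate keys leave the "right" answer ambiguous, which is why fixing the input generation looks like the cleaner option.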

mbasmanova avatar Oct 26 '22 21:10 mbasmanova

Hi @mbasmanova, thank you for looking into this bug. That makes sense to me. I think we can close this issue, since https://github.com/facebookincubator/velox/issues/2848 tracks the problem of generating maps with unique and non-null keys.
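As a rough sketch of what "unique and non-null keys" could look like for a TINYINT key column, one option is to shuffle the whole key domain and take a prefix. The helper `makeUniqueKeys` below is hypothetical and is not the Fuzzer's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper, not part of Velox: returns `count` distinct,
// non-null TINYINT keys in random order.
std::vector<int8_t> makeUniqueKeys(std::size_t count, std::mt19937& rng) {
  // TINYINT has only 256 distinct values, so a map cannot hold more unique keys.
  assert(count <= 256);
  std::vector<int> domain(256);
  std::iota(domain.begin(), domain.end(), -128); // -128 .. 127
  std::shuffle(domain.begin(), domain.end(), rng);

  std::vector<int8_t> keys;
  keys.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    keys.push_back(static_cast<int8_t>(domain[i]));
  }
  return keys;
}
```

Shuffling the full domain is only practical for small key types like this one; for wider types a set-based rejection approach would presumably be needed instead.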

kagamiori avatar Oct 26 '22 21:10 kagamiori

This one should have been solved by @pedroerp's update.

laithsakka avatar Oct 27 '22 15:10 laithsakka

Closing per comments.

mbasmanova avatar Oct 28 '22 15:10 mbasmanova