Issue in HEXL_CHECK_BOUNDS debug check in avx512 reduction when output_mod_factor is 2.
Hello,
I believe there is an issue in the check found at: hexl/eltwise/eltwise-reduce-mod-avx512.hpp:84. The code reads as follows:
if (input_mod_factor == modulus) { if (output_mod_factor == 2) { for (size_t i = 0; i < n_tmp; i += 8) { __m512i v_op = _mm512_loadu_si512(v_operand); v_op = _mm512_hexl_barrett_reduce64<BitShift, 2>( v_op, v_modulus, v_bf, v_bf_52, prod_right_shift, v_neg_mod); HEXL_CHECK_BOUNDS(ExtractValues(v_op).data(), 8, modulus, "v_op exceeds bound " << modulus); _mm512_storeu_si512(v_result, v_op); ++v_operand; ++v_result; } ...
This check will fail even though the operations succeeds. I believe you should be checking against modulus << 1u instead of against modulus, since the output is expected to be within [0, 2*modulus). I'm not sure if there are other problems across the codebase with these checks, but it is difficult to debug issues when these things are throwing.
Hello @jcalafato1, Sorry, I guess you already found a way around this issue. This does look like a bug as the output_mod_factor in this case is 2. However, do you have an input example that actually give you a result over the modulus value, [0, 2*modulus)?
Hello @jcalafato1, Sorry, I guess you already found a way around this issue. This does look like a bug as the output_mod_factor in this case is 2. However, do you have an input example that actually give you a result over the modulus value, [0, 2*modulus)?
When input_mod_factor == modulus, it is almost certain the result will exceed modulus. Here's my example:
operand = [18399319504785536384, 17772833711639413686, 12597119745262224203, 1504294004559805751, 11357185129558358846, 15524763729212309524, 15578066193709346988, 9262080163435001663],
modulus = 1099511590913,
the result (v_op) is [183473757692, 701667589612, 1154521301334, 519986957540, 1153052298859, 914113932554, 1255706689604, 130251390394]
I don't know if the output will ever be greater than 2 * modulus, but this bug means that you can't use the bounds checks to debug other things, so the check in code needs to be updated for this case.
Closed - fixed by PR #153