hexl icon indicating copy to clipboard operation
hexl copied to clipboard

Issue in HEXL_CHECK_BOUNDS debug check in avx512 reduction when output_mod_factor is 2.

Open jcalafato1 opened this issue 2 years ago • 1 comments

Hello,

I believe there is an issue in the check found at: hexl/eltwise/eltwise-reduce-mod-avx512.hpp:84. The code reads as follows:

if (input_mod_factor == modulus) { if (output_mod_factor == 2) { for (size_t i = 0; i < n_tmp; i += 8) { __m512i v_op = _mm512_loadu_si512(v_operand); v_op = _mm512_hexl_barrett_reduce64<BitShift, 2>( v_op, v_modulus, v_bf, v_bf_52, prod_right_shift, v_neg_mod); HEXL_CHECK_BOUNDS(ExtractValues(v_op).data(), 8, modulus, "v_op exceeds bound " << modulus); _mm512_storeu_si512(v_result, v_op); ++v_operand; ++v_result; } ...

This check will fail even though the operations succeeds. I believe you should be checking against modulus << 1u instead of against modulus, since the output is expected to be within [0, 2*modulus). I'm not sure if there are other problems across the codebase with these checks, but it is difficult to debug issues when these things are throwing.

jcalafato1 avatar Nov 30 '23 16:11 jcalafato1

Hello @jcalafato1, Sorry, I guess you already found a way around this issue. This does look like a bug as the output_mod_factor in this case is 2. However, do you have an input example that actually give you a result over the modulus value, [0, 2*modulus)?

joserochh avatar Jul 03 '24 23:07 joserochh

Hello @jcalafato1, Sorry, I guess you already found a way around this issue. This does look like a bug as the output_mod_factor in this case is 2. However, do you have an input example that actually give you a result over the modulus value, [0, 2*modulus)?

When input_mod_factor == modulus, it is almost certain the result will exceed modulus. Here's my example: operand = [18399319504785536384, 17772833711639413686, 12597119745262224203, 1504294004559805751, 11357185129558358846, 15524763729212309524, 15578066193709346988, 9262080163435001663], modulus = 1099511590913, the result (v_op) is [183473757692, 701667589612, 1154521301334, 519986957540, 1153052298859, 914113932554, 1255706689604, 130251390394]

ppppbamzy avatar Oct 24 '24 07:10 ppppbamzy

I don't know if the output will ever be greater than 2 * modulus, but this bug means that you can't use the bounds checks to debug other things, so the check in code needs to be updated for this case.

jcalafato1 avatar Oct 24 '24 13:10 jcalafato1

Closed - fixed by PR #153

faberga avatar May 13 '25 17:05 faberga