Control-flow reconstruction chokes on libcrux-ml-kem's rej_sample
CC @franziskuskiefer and @karthikbhargavan
This piece of code:
#[inline(always)]
pub(crate) fn rej_sample(a: &[u8], result: &mut [i16]) -> usize {
let mut sampled = 0;
for bytes in a.chunks(3) {
let b1 = bytes[0] as i16;
let b2 = bytes[1] as i16;
let b3 = bytes[2] as i16;
let d1 = ((b2 & 0xF) << 8) | b1;
let d2 = (b3 << 4) | (b2 >> 4);
if d1 < FIELD_MODULUS && sampled < 16 {
result[sampled] = d1;
sampled += 1
}
if d2 < FIELD_MODULUS && sampled < 16 {
result[sampled] = d2;
sampled += 1
}
}
sampled
}
found in libcrux/libcrux-ml-kem/src/vector/neon.rs (bottom of the file), seems to cause a problem with control-flow reconstruction. My understanding is that Charon ought to reconstruct the if-blocks. Unfortunately, what I get in LLBC is very far from the control-flow in the source code.
I won't paste the LLBC here since it is incredibly verbose, but essentially, what I receive in Eurydice is this, for the two if-blocks (everything up to let d2 is translated properly):
let mutable uu____12043(Mark.Present,(Mark.AtMost 4), ): bool = $any
in
let mutable uu____12044(Mark.Present,(Mark.AtMost 4), ): int16_t =
$any
in
let mutable uu____12045(Mark.Present,(Mark.AtMost 4), ): bool = $any
in
let mutable uu____12046(Mark.Present,(Mark.AtMost 4), ): size_t = $any
in
let mutable uu____12047(Mark.Present,(Mark.AtMost 4), ): int16_t =
$any
in
let mutable uu____12048(Mark.Present,(Mark.AtMost 4), ): size_t = $any
in
let mutable uu____12053(Mark.Present,(Mark.AtMost 4), ): int16_t =
$any
in
if
(<,int16_t) @8 libcrux_ml_kem.vector.traits.FIELD_MODULUS
then
if
(<,size_t) @15 16size_t
then
(&(Eurydice.slice_index < int16_t > □ @16 @15):
int16_t*)[0uint32_t]
:= @8;
@15 := (+,size_t) @15 1size_t;
@5 := @7;
@0 := libcrux_ml_kem.vector.traits.FIELD_MODULUS;
@6 := (<,int16_t) @5 @0;
if
@6
then
@3 := @15;
@4 := (<,size_t) @3 16size_t;
if
@4
then
@2 := @7;
@1 := @15;
(&(Eurydice.slice_index < int16_t > □ @16 @1):
int16_t*)[0uint32_t]
:= @2;
@15 := (+,size_t) @15 1size_t;
continue
else ()
else ();
continue
else ()
else ();
@5 := @7;
@0 := libcrux_ml_kem.vector.traits.FIELD_MODULUS;
@6 := (<,int16_t) @5 @0;
if
@6
then
@3 := @15;
@4 := (<,size_t) @3 16size_t;
if
@4
then
@2 := @7;
@1 := @15;
(&(Eurydice.slice_index < int16_t > □ @16 @1):
int16_t*)[0uint32_t]
:= @2;
@15 := (+,size_t) @15 1size_t;
continue
else ()
else ()
Ignoring the syntax, notice how there are four nested if-blocks, and notice how the second if-block does not appear alongside the first.
This would be a great code-quality improvement. Thank you.
I can reproduce; here's a simpler example:
fn rej_sample(a: &[u8]) -> usize {
let mut sampled = 0;
if a[0] < 42 && a[1] < 16 {
sampled += 100;
}
sampled += 101; // This statement is duplicated.
sampled
}
The resulting llbc is roughly:
fn rej_sample(a: &[u8]) -> usize {
let mut sampled = 0;
if a[0] < 42 {
if a[1] < 16 {
sampled += 100;
sampled += 101; // This statement is duplicated.
return sampled;
}
}
sampled += 101; // This statement is duplicated.
sampled
}
Excellent. Thanks for minimizing!
Thanks for minimizing!