formats base32ct: Base32Unpadded encodes to invalid output

I ran into this issue while implementing an additional alphabet and writing tests. It appears that Encoding::encode sometimes produces output containing \0 bytes for unpadded alphabets, which then causes a panic when you try to decode it.

thread 'encode_decode_roundtrip_unpadded' panicked at 'index out of bounds: the len is 0 but the index is 0', /formats/base32ct/src/encoding.rs:105:13

Uncovered with the following proptest:

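use proptest::prelude::*;
use proptest::string::bytes_regex;

proptest! {
    // Round-trip: encoding arbitrary bytes and decoding the result should return the input.
    #[test]
    fn encode_decode_roundtrip_unpadded(bytes in bytes_regex(".{0,256}").unwrap()) {
        let encoded = Base32UnpaddedCt::encode_string(&bytes);
        dbg!(&encoded);
        let decoded = Base32UnpaddedCt::decode_vec(&encoded);
        prop_assert_eq!(Ok(bytes), decoded);
    }
}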

The equivalent proptest for padded alphabets does not have this issue.

proptest-regressions line:

cc 9423528ad70b35a1769184980eb56ee14d6d8253e6a25d3a93fb1c6df646f3bd # shrinks to bytes = [97, 240, 144, 128, 128, 240, 144, 128, 128, 240, 144, 128, 128, 224, 160, 128, 194, 161, 240, 144, 128, 128, 240, 144, 128, 128, 240, 144, 128, 128, 65, 65, 97, 48, 65, 11, 32, 48, 240, 144, 128, 128, 65, 65, 32, 224, 160, 128, 48, 65, 97, 0, 0, 0, 32, 97, 32, 240, 144, 128, 128, 0, 65, 0, 65, 240, 144, 128, 128, 240, 144, 128, 128, 194, 161, 224, 160, 128, 0, 97, 240, 144, 128, 128, 240, 144, 128, 128, 240, 144, 128, 128, 0, 240, 144, 128, 128, 48, 32, 240, 144, 128, 128, 97, 48, 194, 161, 194, 161, 65, 65, 11, 0, 48, 0, 194, 161, 97, 240, 144, 128, 128, 194, 161, 48, 32, 240, 144, 128, 128, 224, 160, 151, 123, 13, 35, 200, 186, 243, 143, 165, 158, 242, 132, 169, 134, 239, 187, 191, 61, 47, 200, 186, 38, 37, 2, 240, 153, 129, 138, 240, 159, 149, 180, 27, 127, 107, 240, 177, 155, 189, 87, 9, 36, 239, 187, 191, 96, 8, 242, 153, 158, 162, 39, 243, 180, 141, 149, 46, 89, 121, 243, 172, 157, 150, 243, 191, 171, 146, 58, 36, 226, 128, 174, 243, 130, 174, 167, 226, 128, 174, 243, 156, 135, 184, 243, 128, 174, 129, 13, 38, 243, 157, 141, 145, 38, 0, 80, 1, 98, 235, 184, 151, 39, 9, 244, 136, 135, 159, 226, 128, 174, 241, 134, 129, 187, 58, 240, 159, 149, 180, 241, 152, 163, 168, 58, 127, 34, 36, 0, 194, 165, 123, 49, 63, 240, 151, 132, 144, 9, 243, 187, 186, 158, 46, 82, 0, 100, 34, 200, 186, 209, 168, 8, 11, 13, 195, 159, 105, 61, 4, 244, 132, 138, 147, 194, 165, 56, 241, 130, 176, 147, 241, 140, 185, 161, 241, 188, 187, 142, 209, 168, 76, 11, 80, 60, 240, 151, 184, 135, 47, 52, 7, 95, 195, 166, 62, 194, 165, 9, 241, 165, 173, 133, 241, 136, 169, 149, 74, 241, 131, 134, 153, 91, 39, 239, 191, 189, 244, 137, 173, 187, 92, 123, 95, 92, 123, 0, 209, 168, 71, 7, 62, 82, 240, 159, 149, 180, 9, 241, 144, 152, 167, 77, 77, 242, 131, 133, 177, 34, 240, 187, 152, 184, 48, 46, 54, 243, 170, 171, 183, 37, 86, 243, 182, 138, 135, 240, 169, 145, 166, 123, 241, 180, 170, 130, 240, 187, 143, 158, 
77, 58, 126, 36, 242, 148, 133, 176, 240, 159, 149, 180, 242, 133, 171, 132, 27, 241, 173, 161, 129, 242, 145, 137, 176, 83, 105, 243, 183, 178, 154, 46, 92, 241, 135, 167, 134, 232, 173, 185, 34, 115, 36, 11, 38]

Mar 21 '23 17:03 Kiskae

Reduced to this testcase:

#[test]
fn decode_convergent_length() {
    assert!(Base32Unpadded::decode_vec(&"a".repeat(729)).is_ok());
}

It turns out that at length 729, the decoded_len function returns a multiple of 5. This causes dst.chunks_exact_mut(5).into_remainder() in Encoding::decode to return an empty slice, since there is no remainder.

An input of length 728 also has this property, but since it is also divisible by 8, it skips all the places where it tries to write to dst_rem.

This bug is also triggered by an input of length 1, which makes it use a dst buffer of size 0:

#[test]
fn decode_too_short_error() {
    assert_eq!(
        Base32Unpadded::decode_vec("a"),
        Err(Error::InvalidEncoding)
    );
}
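
To make the length arithmetic concrete, here is a minimal sketch. It assumes the decoded length is computed as floor(encoded_len * 5 / 8), which matches the numbers above but is not copied from the crate; decoded_len_sketch and remainder_len are hypothetical helper names used only for illustration.

```rust
/// Hypothetical stand-in for the crate's length calculation.
/// Assumption: decoded length = floor(encoded_len * 5 / 8).
fn decoded_len_sketch(encoded_len: usize) -> usize {
    encoded_len * 5 / 8
}

/// Length of the slice left over after splitting a `len`-byte buffer
/// into exact 5-byte chunks (what `into_remainder()` would hand back).
fn remainder_len(len: usize) -> usize {
    let mut dst = vec![0u8; len];
    dst.chunks_exact_mut(5).into_remainder().len()
}

fn main() {
    for n in [1usize, 728, 729] {
        let out = decoded_len_sketch(n);
        println!(
            "encoded_len={n} trailing_chars={} decoded_len={out} dst_rem_len={}",
            n % 8,
            remainder_len(out)
        );
    }
}
```

Under this assumed formula, lengths 728 and 729 both map to a 455-byte buffer, and length 1 maps to an empty one. In every case the remainder slice is empty, but only 729 and 1 leave a trailing partial group of input characters (n % 8 != 0) that would need somewhere to be written.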

Mar 22 '23 20:03 Kiskae

Seems like a legitimate bug. I'll try to take a look this weekend.

Mar 24 '23 15:03 tarcieri

The title of this issue appears to describe a third bug, where encode does not properly truncate the unused parts of the output buffer.
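
The thread does not say where the unused bytes come from. As a rough illustration only, the sketch below compares the standard unpadded base32 length, ceil(8n/5), with the padded length, 8 * ceil(n/5). It assumes, hypothetically, that the output buffer is zero-initialized and sized for the padded length; the function names are made up for this example and are not the crate's API.

```rust
/// Number of base32 characters needed without padding: ceil(8n / 5).
fn unpadded_len(n: usize) -> usize {
    (n * 8 + 4) / 5
}

/// Number of base32 characters with '=' padding: 8 * ceil(n / 5).
fn padded_len(n: usize) -> usize {
    (n + 4) / 5 * 8
}

fn main() {
    for n in 0..=5usize {
        let used = unpadded_len(n);
        let allocated = padded_len(n);
        // If the buffer were zero-initialized and never truncated, the
        // difference is how many \0 bytes would trail the real output.
        println!("input={n} used={used} allocated={allocated} unused={}", allocated - used);
    }
}
```

If something like this is what happens, truncating the buffer to the unpadded length before returning it would remove the trailing \0 bytes.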

Mar 24 '23 15:03 Kiskae
