infisearch icon indicating copy to clipboard operation
infisearch copied to clipboard

Multiple Possible Memory Bugs in infisearch

Open ydongyeon opened this issue 4 months ago • 0 comments

Summary

(1) heap-buffer-overflow An unsafe indexing vulnerability in the get_num_val method of the DocInfo struct allows arbitrary memory access when the provided index exceeds the buffer length, potentially triggering undefined behavior. There is a debug_assert function that tries to prevent possible memory bugs, however, debug_assert function is implemented incorrectly.

(2) global-buffer-overflow A read_type function allows arbitrary memory access when the provided index exceeds the buffer length, potentially triggering undefined behavior.

Details

Hi,

First, I want to extend my gratitude for maintaining this excellent crate. I’ve identified some potential security vulnerabilities: Heap Buffer Overflow, Global Buffer Overflow.

Environment:

  • OS: Ubuntu 22.04.4 LTS
  • Rust Compiler: rustc 1.81.0 (eeb90cda1 2024-09-04)
  • Cargo: cargo 1.81.0 (2dbb1af80 2024-08-20)
  • infisearch: Latest version (commit cb68c37322e1b6f57ad53357b0e2a3e191fd15a2, 2023-05-16)

Steps to reproduce Heap Buffer Overflow:

(1) Replace infisearch/packages/infisearch_search/src/doc_info.rs with the modified doc_info.rs file as below.


... // existing codes

// #[inline(always)]
pub fn get_num_val(&self, doc_id: usize, num_id: usize) -> i64 {
    debug_assert!(((doc_id * self.num_i64_fields) + num_id) < self.doc_enum_vals.len());

    unsafe {
        *self.doc_i64_vals.get_unchecked((doc_id * self.num_i64_fields) + num_id)
    }
}


#[cfg(test)]
mod tests {
    use super::*; 

    #[test]
    fn test_doc() {
        println!("hello");
        let doc_length_factors = vec![1.0, 2.5, 3.75];
        let doc_length_factors_len = doc_length_factors.len() as u32;

        let doc_enum_vals = vec![0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16];

        let doc_i64_vals = vec![100, 200, 300, 400];

        let num_docs = 4;
        let num_fields = 10;
        let num_enum_fields = 3;
        let num_i64_fields = 4;

        let doc_info = DocInfo {
            doc_length_factors,
            doc_length_factors_len,
            doc_enum_vals,
            doc_i64_vals,
            num_docs,
            num_fields,
            num_enum_fields,
            num_i64_fields,
        };

        let b = doc_info.get_num_val(1, 2); // (1 * 4) + 2 = 6, doc_i64_vals.len() = 4
        println!("{:?}", b);
    }
}

(2) Run the test using the ASan flag.

RUSTFLAGS="-Zsanitizer=address -C opt-level=2" cargo +nightly test test_doc --target x86_64-unknown-linux-gnu

Details: // infisearch/packages/infisearch_search/src/doc_info.rs

pub fn get_num_val(&self, doc_id: usize, num_id: usize) -> i64 {
    debug_assert!(((doc_id * self.num_i64_fields) + num_id) < self.doc_enum_vals.len());

    unsafe {
        *self.doc_i64_vals.get_unchecked((doc_id * self.num_i64_fields) + num_id)
    }
}

Logical Error: The debug_assert! is incorrectly comparing the index against self.doc_enum_vals.len() instead of self.doc_i64_vals.len(). This means that the assertion might pass even when the index is out of bounds for doc_i64_vals. The method uses get_unchecked, which performs an unsafe access without bounds checking. If the index is out of bounds, it leads to undefined behavior.

Actual results : running 1 test ==1349695==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x503000001960 at pc 0x5555556d1add bp 0x7ffff4afe490 sp 0x7ffff4afe488 READ of size 8 at 0x503000001960 thread T1 #0 0x5555556d1adc in infisearch_search::doc_info::DocInfo::get_num_val::h6b752db723cfe24b /home/dy3199/Fuzzing-Test/infisearch/packages/infisearch_search/src/doc_info.rs:78:13 #1 0x5555556d1adc in infisearch_search::doc_info::tests::test_doc::hfe8dc957d287c736 /home/dy3199/Fuzzing-Test/infisearch/packages/infisearch_search/src/doc_info.rs:113:17 #2 0x555555729408 in infisearch_search::doc_info::tests::test_doc::_$u7b$$u7b$closure$u7d$$u7d$::h188f5cc4c78c2e4c /home/dy3199/Fuzzing-Test/infisearch/packages/infisearch_search/src/doc_info.rs:88:18 #3 0x555555729408 in core::ops::function::FnOnce::call_once::h261de6e9d2e86fbb /home/dy3199/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5

...

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/dy3199/Fuzzing-Test/infisearch/packages/infisearch_search/src/doc_info.rs:78:13 in infisearch_search::doc_info::DocInfo::get_num_val::h6b752db723cfe24b Shadow bytes around the buggy address: 0x503000001680: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x503000001700: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x503000001780: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x503000001800: fa fa fd fd fd fd fa fa fd fd fd fd fa fa 00 00 0x503000001880: 00 00 fa fa 00 00 00 00 fa fa 00 00 00 fa fa fa =>0x503000001900: 00 00 01 fa fa fa 00 00 00 00 fa fa[fa]fa fa fa 0x503000001980: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x503000001a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x503000001a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x503000001b00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x503000001b80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==287389==ABORTING

Screenshot 2024-10-13 at 9 41 46 PM

Recommended Patch: Given the potential memory safety issues, I would suggest: Modifying assert function in get_num_val with proper value.

pub fn get_num_val(&self, doc_id: usize, num_id: usize) -> i64 {
    debug_assert!(((doc_id * self.num_i64_fields) + num_id) < self.doc_i64_vals.len());

    unsafe {
        *self.doc_i64_vals.get_unchecked((doc_id * self.num_i64_fields) + num_id)
    }
}

Steps to reproduce Global Buffer Overflow:

(1) Replace infisearch/packages/infisearch_common/src/packed_var_int.rs with the modified packed_var_int.rs file as below.

#[cfg(test)]
mod test {
    use super::read_bits_from;

    #[test]
    fn test_bitread() {
        let mut bit_pos = 0;
        let result = read_bits_from(&mut bit_pos, 5, &[0, 0, 0]);
        assert!(bit_pos == 5);
        assert!(result == 0);

        let mut bit_pos = 0;
        let result = read_bits_from(&mut bit_pos, 15, &[0, 0, 0]);
        assert!(bit_pos == 15);
        assert!(result == 0);

        let mut bit_pos = 0;
        let result = read_bits_from(&mut bit_pos, 8, &[255, 0, 0]);
        assert!(bit_pos == 8);
        assert!(result == 255);
        
        let mut bit_pos = 0;
        let result = read_bits_from(&mut bit_pos, 4, &[255, 0, 0]);
        assert!(bit_pos == 4);
        assert!(result == 15);
        
        let mut bit_pos = 11;
        let result = read_bits_from(&mut bit_pos, 4, &[0, 255, 0]);
        assert!(bit_pos == 15);
        assert!(result == 15);

        let mut bit_pos = 0;
        let result = read_bits_from(&mut bit_pos, 11, &[255, 0b0010_0000, 0]);
        assert!(bit_pos == 11);
        assert!(result == 0b1111_1111_001);

        let mut bit_pos = 0;
        let result = read_bits_from(&mut bit_pos, 24, &[255, 0b0010_0000, 0]);
        assert!(bit_pos == 24);
        assert!(result == 0b1111_1111_0010_0000_0000_0000);

        let mut bit_pos = 4;
        let result = read_bits_from(&mut bit_pos, 11, &[255, 0b0010_0000, 0]);
        assert!(bit_pos == 15);
        assert!(result == 0b1111_0010_000);

        let mut bit_pos = 4;
        let result = read_bits_from(&mut bit_pos, 20, &[255, 0b0010_0000, 0]);
        assert!(bit_pos == 24);
        assert!(result == 0b1111_0010_0000_0000_0000);
    }

    use super::{PackedVarIntReader};

    #[test]
    fn test_packed_varint_reader() {
        const NUM_TYPES: usize = 2;

        let max_bit_lens: [usize; NUM_TYPES] = [5, 8];
        let type_chunk_max_idxes: [u8; NUM_TYPES] = [3, 4];

        let bits_as_bytes: &[u8] = &[0b00101011, 0b11010010, 0b11110000];

        let mut reader = PackedVarIntReader::new(bits_as_bytes, max_bit_lens, type_chunk_max_idxes);

        // Read integers of type 0
        let value1 = reader.read_type(3);
        println!("Type 0, Value 1: {}", value1);

        // let value2 = reader.read_type(0);
        // println!("Type 0, Value 2: {}", value2);

        // Read integers of type 1
        let value4 = reader.read_type(1);
        println!("Type 1, Value 1: {}", value4);



    }


}

(2) Run the test using the ASan flag.

RUSTFLAGS="-Zsanitizer=address" cargo +nightly test test_packed_varint_reader --target x86_64-unknown-linux-gnu --release

Details: // infisearch/packages/infisearch_common/src/packed_var_int.rs

pub fn read_type(&mut self, t: usize) -> u32 {
    // Guarantees: t < NUM_TYPES at compile time

    if unsafe { *self.type_chunk_idxes.get_unchecked(t) } == 0 {
        // Read current chunk's bit length
        *unsafe { self.type_chunk_lens.get_unchecked_mut(t) } = read_bits_from(
            &mut self.bit_pos,
            unsafe { *self.max_bit_lens.get_unchecked(t) },
            &self.bits_as_bytes,
        ) as usize + 1; // was -1 ed when encoding to allow encoding 1 more power, reverse it
    }

    unsafe {
        *self.type_chunk_idxes.get_unchecked_mut(t) =
            ((*self.type_chunk_idxes.get_unchecked(t)) + 1)
            % (*self.type_chunk_max_idxes.get_unchecked(t));
    }

    read_bits_from(
        &mut self.bit_pos,
        unsafe { *self.type_chunk_lens.get_unchecked(t) },
        &self.bits_as_bytes,
    )
}

In this case, the read_type method uses the unsafe keyword to access memory without performing bounds checking. Specifically, it directly dereferences a pointer offset by index t from the type_chunk_idxes. This approach violates Rust’s memory safety guarantees, as it can lead to invalid memory access if index t exceeds the intended buffer length (type_chunk_idxes.len()).

Actual results : ==1353076==ERROR: AddressSanitizer: global-buffer-overflow on address 0x555555582a83 at pc 0x55555567daed bp 0x7ffff4cfe3b0 sp 0x7ffff4cfe3a8 READ of size 1 at 0x555555582a83 thread T1 #0 0x55555567daec in infisearch_common::packed_var_int::PackedVarIntReader$LT$$GT$::read_type::h7e051bf06f802e10 infisearch_common.8faa01ae863d3bd8-cgu.0 #1 0x555555673a2b in core::ops::function::FnOnce::call_once::he16019071797838b infisearch_common.8faa01ae863d3bd8-cgu.0 #2 0x5555556f55da in test::__rust_begin_short_backtrace::h0db03bcef8350635 infisearch_common.8faa01ae863d3bd8-cgu.0 #3 0x5555556f2180 in test::run_test::$u7b$$u7b$closure$u7d$$u7d$::he4cb7f7454d67ec7 infisearch_common.8faa01ae863d3bd8-cgu.0 #4 0x5555556f591b in std::sys::backtrace::__rust_begin_short_backtrace::hd1596cbf522e6291 infisearch_common.8faa01ae863d3bd8-cgu.0 ...

Screenshot 2024-10-13 at 10 11 44 PM

Recommended Patch: Given the potential memory safety issues, I would suggest: Implementing input validation in read_type to safely handle unexpected index t value.

pub fn read_type(&mut self, t: usize) -> u32 {
    assert!(t < type_chunk_idxes.length());

    if unsafe { *self.type_chunk_idxes.get_unchecked(t) } == 0 {
        // Read current chunk's bit length
        *unsafe { self.type_chunk_lens.get_unchecked_mut(t) } = read_bits_from(
            &mut self.bit_pos,
            unsafe { *self.max_bit_lens.get_unchecked(t) },
            &self.bits_as_bytes,
        ) as usize + 1; // was -1 ed when encoding to allow encoding 1 more power, reverse it
    }
...

Impact

The get_num_val and read_type methods of the infisearch library introduce unsafe indexing vulnerabilities by allowing arbitrary memory access without bounds checking. These flaws can lead to undefined behavior.

Although these bugs may be difficult to exploit in practice, I understand that the Rust community reports such issues to RUSTSEC in an effort to further enhance memory safety. Rust considers these issues critical to memory safety, regardless of whether they have been exploited, and takes proactive measures to either fix or report them. I have included references to similar cases below for your consideration. Therefore, while the potential for exploitation may be low, I believe that eliminating potential memory unsafety in unsafe regions can contribute to strengthening Rust's memory safety.

Panic on overflow in subtraction RUSTSEC-2023-0078 RUSTSEC-2022-0078 RUSTSEC-2022-0012

ydongyeon avatar Oct 13 '24 13:10 ydongyeon