polars icon indicating copy to clipboard operation
polars copied to clipboard

Inconsistency between code and comment

Open YichiZhang0613 opened this issue 2 months ago • 1 comments

Description

There is some inconsistencies between comments and codes. In polars-main/crates/polars-arrow/src/array/primitive/mutable.rs, I think the comment should be "Panics iff index is larger than or equal to self.len()." polars-main/crates/polars-arrow/src/array/primitive/mutable.rs

    /// # Panic
    /// Panics iff index is larger than `self.len()`.
    pub fn set(&mut self, index: usize, value: Option<T>) {
        assert!(index < self.len());
        // SAFETY:
        // we just checked bounds
        unsafe { self.set_unchecked(index, value) }
    }

In the following examples, the constraints in comment are not checked in code. For example, panics iff index >= self.len() indicates that there is a constraint index < self.len(). So the code should check it before using index directly. I don't think“out-of-bound” or "empty" crash is the behavior we expect. The ideal situation is to inform the "user" what should happen if it is out-of-bound,so I think use assert! to check whether the index is out-of-bound or a list is empty before use it and add some information when the code panics is a better way. polars-main/crates/polars-arrow/src/array/dictionary/mod.rs

   /// # Panics
    /// This function panics iff `index >= self.len()`
    #[inline]
    pub fn key_value(&self, index: usize) -> usize {
        // SAFETY: invariant of the struct
        unsafe { self.keys.values()[index].as_usize() }
    }

polars-main/crates/polars-arrow/src/bitmap/immutable.rs

   /// # Panics
    /// Panics iff `i >= self.len()`.
    #[inline]
    pub fn get_bit(&self, i: usize) -> bool {
        get_bit(&self.bytes, self.offset + i)
    }

polars-main/crates/polars-parquet/src/parquet/encoding/hybrid_rle/bitmap.rs

   /// # Panics
    /// This function panics iff `offset / 8 > slice.len()`
    #[inline]
    pub fn new(slice: &'a [u8], offset: usize, len: usize) -> Self {
        let bytes = &slice[offset / 8..];

polars-main/crates/polars-arrow/src/io/ipc/read/common.rs

   /// # Panics
    /// iff `projection` is empty
    pub fn new(projection: &'a [usize], iter: I) -> Self {
        Self {
            projection: &projection[1..],

polars-main/crates/polars-arrow/src/array/primitive/mod.rs

/// # Panic
/// This function panics iff `i >= self.len`.
#[inline]
pub fn value(&self, i: usize) -> T {
    self.values[i]
}

polars-main/crates/polars-arrow/src/bitmap/utils/mod.rs

/// # Panic
/// This function panics iff `i >= bytes.len() * 8`.
#[inline]
pub fn get_bit(bytes: &[u8], i: usize) -> bool {
    let byte = bytes[i / 8];

polars-main/crates/polars-arrow/src/io/ipc/read/file.rs

/// # Panics
/// This function panics iff `index >= metadata.blocks.len()`
#[allow(clippy::too_many_arguments)]
pub fn read_batch<R: Read + Seek>(
    reader: &mut R,
    dictionaries: &Dictionaries,
    metadata: &FileMetadata,
    projection: Option<&[usize]>,
    limit: Option<usize>,
    index: usize,
    message_scratch: &mut Vec<u8>,
    data_scratch: &mut Vec<u8>,
) -> PolarsResult<RecordBatch<Box<dyn Array>>> {
    let block = metadata.blocks[index];

YichiZhang0613 avatar Apr 30 '24 10:04 YichiZhang0613

The implicit panics of indexing v[i] are fine in my opinion. Feel free to send a PR that fixes the comments that incorrectly state they check if idx > self.len() when they should state idx >= self.len() or just 'panics if idx is out of bounds'.

orlp avatar Apr 30 '24 15:04 orlp

I found 3 more and sent pull request 3 days ago

YichiZhang0613 avatar May 04 '24 12:05 YichiZhang0613