expose utility methods for creating Arrays from slices.

Open universalmind303 opened this issue 11 months ago • 2 comments

Description

Given this example:

use polars_arrow::{
    array::Utf8Array, bitmap::Bitmap, buffer::Buffer, datatypes::ArrowDataType,
    offset::OffsetsBuffer,
};

// the second println! should print [Some("hello"), None, Some("world")]
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let offsets: &[i64] = &[0, 5, 5, 10];
    let values: &[u8] = &[104, 101, 108, 108, 111, 119, 111, 114, 108, 100];
    let null_bitmap: &[u8] = &[5];

    let utf8_array = Utf8Array::try_new(
        ArrowDataType::LargeUtf8, // i64 offsets -> LargeUtf8; i32 offsets -> Utf8
        OffsetsBuffer::try_from(offsets.to_owned())?,
        Buffer::from(values.to_owned()),
        Some(Bitmap::try_new(null_bitmap.to_owned(), 3).unwrap()),
    ).unwrap();

    println!("{utf8_array:?}");
    println!("{:?}", utf8_array.iter().collect::<Vec<_>>());

    Ok(())
}

This is hard to discover (it relies on from/try_from conversions), and it requires copying the slices.

I'd prefer some way to create an array from the borrowed values instead.

arrow-rs provides some nice utilities for this:

let array_data = ArrayData::builder(DataType::LargeUtf8)
    .len(3)
    .null_count(1)
    .add_buffer(Buffer::from_slice_ref(offsets))
    .add_buffer(Buffer::from_slice_ref(values))
    .null_bit_buffer(Some(Buffer::from_slice_ref(null_bitmap)))
    .build()
    .unwrap();
let string_array = LargeStringArray::from(array_data);

So something like the following would be preferred:

let utf8_array = Utf8Array::try_new(
    ArrowDataType::LargeUtf8,
    OffsetsBuffer::from_slice(offsets)?,
    Buffer::from_slice(values)?,
    Some(Bitmap::from_slice(null_bitmap, 3)),
)?;
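
For anyone unfamiliar with the three raw buffers used in the examples above, here is a minimal std-only sketch of how they decode, assuming the standard Arrow layout: N + 1 offsets for N strings, and a validity bitmap read least-significant bit first. The function decode_large_utf8 is a hypothetical helper written for illustration; it is not part of polars_arrow or arrow-rs, and it borrows from the values slice instead of copying it.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of polars_arrow or arrow-rs): decodes the
/// three raw LargeUtf8 buffers into optional string slices borrowed from `values`.
fn decode_large_utf8<'a>(
    offsets: &[i64],
    values: &'a [u8],
    validity: &[u8],
) -> Result<Vec<Option<&'a str>>, String> {
    // N strings are described by N + 1 offsets.
    let len = offsets.len().checked_sub(1).ok_or("offsets must not be empty")?;
    if validity.len() * 8 < len {
        return Err("validity bitmap is too short".to_string());
    }

    let mut out = Vec::with_capacity(len);
    for i in 0..len {
        // Bit i (least-significant bit first within each byte) is 1 when slot i is valid.
        let is_valid = (validity[i / 8] >> (i % 8)) & 1 == 1;
        if !is_valid {
            out.push(None);
            continue;
        }
        // Slot i spans values[offsets[i]..offsets[i + 1]].
        let start = usize::try_from(offsets[i]).map_err(|e| e.to_string())?;
        let end = usize::try_from(offsets[i + 1]).map_err(|e| e.to_string())?;
        let bytes = values.get(start..end).ok_or("offsets out of bounds")?;
        out.push(Some(std::str::from_utf8(bytes).map_err(|e| e.to_string())?));
    }
    Ok(out)
}
```

With the buffers from the first example (offsets [0, 5, 5, 10], the ten bytes of "helloworld", and the bitmap byte 5, which is 0b101), this returns Some("hello"), None, Some("world"): bit 1 of the bitmap is unset, so the middle slot is null.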

Additional context

https://stackoverflow.com/questions/78181786/how-to-create-a-polars-arrow-array-from-raw-values-u8/78182192#78182192

universalmind303, Mar 18 '24 18:03