Incorrect statistics read for unsigned integer columns in parquet

Open NGA-TRAN opened this issue 9 months ago • 1 comments

Describe the bug

I found this bug while adding tests for reading parquet statistics in https://github.com/apache/datafusion/pull/10592/. For columns with data types u8, u16, and u32, the statistics come back as Int32Array instead of the corresponding UInt8Array, UInt16Array, and UInt32Array. Similarly, for a column with data type u64, we get Int64Array instead of UInt64Array.

To Reproduce

See the test test_uint in PR https://github.com/apache/datafusion/pull/10592 (will be merged soon)

Expected behavior

Statistics should be returned as UInt8Array, UInt16Array, UInt32Array, and UInt64Array for columns with data types u8, u16, u32, and u64 respectively.
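
For context, Parquet has no unsigned physical types: u8, u16, and u32 columns are stored as physical INT32 and u64 columns as physical INT64, with the unsigned logical type recorded as an annotation. Statistics therefore arrive as i32/i64 values and need to be mapped back to the logical type. The following is a minimal std-only sketch of that mapping. The helper names are hypothetical and this is not DataFusion's actual code or fix; it only illustrates the kind of conversion the expected behavior implies.

```rust
// Hypothetical helpers (not part of DataFusion or the parquet crate) that map
// a physical-typed statistic value back to its unsigned logical type.

/// u8 statistics are stored in a physical INT32; values outside 0..=255
/// would indicate corrupt statistics, so return None for them.
fn stat_as_u8(physical: i32) -> Option<u8> {
    u8::try_from(physical).ok()
}

/// u16 statistics are also stored in a physical INT32.
fn stat_as_u16(physical: i32) -> Option<u16> {
    u16::try_from(physical).ok()
}

/// u32 uses all 32 bits of the physical INT32, so values above i32::MAX
/// show up as negative numbers. Reinterpret the bits instead of range-checking.
fn stat_as_u32(physical: i32) -> u32 {
    physical as u32
}

/// u64 uses all 64 bits of the physical INT64; same bit reinterpretation.
fn stat_as_u64(physical: i64) -> u64 {
    physical as u64
}
```

For example, a u32 column whose maximum is `u32::MAX` would have the physical statistic `-1_i32`, and `stat_as_u32(-1)` recovers `u32::MAX`. Returning the raw Int32Array, as described in this issue, would instead surface the `-1`.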

Additional context

No response

NGA-TRAN, May 21 '24 15:05