Ben McDonald

21 issues authored by Ben McDonald

In a [comment from Brad](https://github.com/Bears-R-Us/arkouda/pull/2841#issuecomment-1793275991), it was suggested that the build process would be improved if the `Makefile` set `CHPL_HOME`, when unset, using `chpl --print-chpl-home`,...
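A minimal sketch of what that `Makefile` change could look like (hypothetical; the actual Arkouda `Makefile` may structure this differently):

```make
# If the environment does not define CHPL_HOME, ask the Chapel
# compiler for its home directory (hypothetical sketch).
ifndef CHPL_HOME
  CHPL_HOME := $(shell chpl --print-chpl-home)
endif
```

Using `:=` inside the conditional evaluates the shell command once, rather than on every expansion of the variable.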


Parquet files today are written one-file-per-locale because the interface was based on the HDF5 interface, which also writes one file per locale, so...

After seeing more results gathered on different machines showing mixed results for the new Parquet string optimization, we have decided to revert to a simpler...

The Parquet code previously duplicated blocks of code that differed only in their types. To reduce this duplication and improve readability and maintainability, template functions are being added in...
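The pattern being applied can be illustrated in Python (a hypothetical sketch; the actual change lives in the compiled Parquet glue code): per-type copies of a function collapse into one definition parameterized by the type.

```python
import numpy as np

# Before: one nearly identical reader per element type (duplicated logic).
def read_int64_column(raw: bytes) -> np.ndarray:
    return np.frombuffer(raw, dtype=np.int64)

def read_float64_column(raw: bytes) -> np.ndarray:
    return np.frombuffer(raw, dtype=np.float64)

# After: a single generic reader, mirroring how a template function
# replaces per-type copies with one definition.
def read_column(raw: bytes, dtype) -> np.ndarray:
    return np.frombuffer(raw, dtype=dtype)
```

Any change to the shared logic now happens in one place instead of once per type.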

Some experiments have shown that zstd provides the best compression ratio as well as the best write/read times, so the decision was made to switch the Parquet default to zstd compression. Closes...

`can_cast` behavior changed in NumPy 2.0, which causes test failures when running under NumPy 2.0. We will likely need to roll our own `can_cast` to emulate the old behavior.

In NumPy 2.0, the behavior of `can_cast` changed to no longer depend on values, only types, so by querying the dtype of the pdarray, we can work with NumPy 2.0.
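A sketch of emulating the old value-aware check for integer scalars (a hypothetical helper, not Arkouda's actual code): an integer value is castable to an integer dtype when it fits in that dtype's range, and everything else falls back to the type-only rule.

```python
import numpy as np

def can_cast_value(value, dtype) -> bool:
    """Emulate NumPy 1.x value-based casting for integer scalars."""
    dtype = np.dtype(dtype)
    if isinstance(value, int) and dtype.kind in "iu":
        # Old behavior: a Python int casts if it fits in the target range.
        info = np.iinfo(dtype)
        return info.min <= value <= info.max
    # Otherwise apply the type-only rule, which NumPy 2.0 always uses.
    return np.can_cast(np.asarray(value).dtype, dtype)
```

This runs identically under NumPy 1.x and 2.x because it never passes a Python scalar directly to `np.can_cast`.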

To improve the performance of Parquet string reading, the batch size used for byte calculation is increased, giving each batch a larger chunk of work and reducing per-batch overhead.
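As a rough illustration of the idea (hypothetical Python, not the actual reader code): summing per-string byte lengths in batches produces the same total regardless of batch size, but a larger batch size means fewer iterations of the outer loop and less fixed per-batch overhead.

```python
def total_bytes(lengths, batch_size):
    """Sum per-string byte lengths batch by batch; a larger batch_size
    means fewer batches, so fixed per-batch costs are amortized."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        total += sum(lengths[start:start + batch_size])
    return total
```

The result is unchanged; only the amount of work done per batch grows.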