arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
### Rationale for this change The order of rows in a dataset might be important for users and should be preserved when writing to a filesystem. With multi-threaded write, the...
### Describe the enhancement requested I'm looking for documentation on how to implement an ExtensionArray which supports `join` functionality. Particularly, I'd like to join a table which includes a `FixedShapeTensorArray`...
**Description** We've identified a memory leak when importing Parquet files into Pandas DataFrames using the PyArrow engine. The issue occurs specifically during the conversion from Arrow to Pandas objects, as...
### Describe the bug, including details regarding any error messages, version, and platform. I'm trying to create an HDFS Connection via `pyarrow.fs.HadoopFileSystem`, but unfortunately I get an error: ```{python} from...
### Describe the enhancement requested emsdk is large for `ubuntu-22.04-cpp.dockerfile`. `ubuntu-22.04-cpp.dockerfile` is shared by several images, so it should be kept as small as possible. ### Component(s) C++, Continuous Integration
Currently, when writing a dataset, e.g. from a table consisting of a set of record batches, there is no guarantee that the row order is preserved when reading the dataset....
### Describe the bug, including details regarding any error messages, version, and platform. The [Windows wheel binary verification](https://github.com/ursacomputing/crossbow/actions/runs/11378238189/job/31671317057) is currently failing due to missing `PARQUET_TEST_DATA`: ``` @pytest.fixture(scope='module') def parquet_test_datadir(): if...
### Describe the bug, including details regarding any error messages, version, and platform. I encountered an issue when working with Arrow Flight SQL, and I would appreciate your help. I...
GH-44461: [Release][Packaging][Python] Set PARQUET_TEST_DATA on verify-release-candidate-wheels.bat
### Rationale for this change The Windows wheel verification fails due to missing `PARQUET_TEST_DATA`. ### What changes are included in this PR? Add `PARQUET_TEST_DATA` to `verify-release-candidate-wheels.bat` which is only tested...
### Describe the enhancement requested Artifactory has a 400 GiB quota, so we want to avoid using it for temporary artifacts as much as possible. Wheels are uploaded only for voting. We...