asfimport
See https://github.com/apache/arrow/pull/4498/files for reference. **Reporter**: [Micah Kornfield](https://issues.apache.org/jira/browse/ARROW-5550) / @emkornfield **Note**: *This issue was originally created as [ARROW-5550](https://issues.apache.org/jira/browse/ARROW-5550). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*
I have JSON data where the columnar (line-delimited) part is in a `data` subkey: ```java { "metadata": {"name": "block1"}, "data" : [ {"a": 1, "b": 2.0, "c": "foo", "d": false},...
See https://github.com/apache/arrow/pull/4293#issuecomment-501950675 **Reporter**: [Francois Saint-Jacques](https://issues.apache.org/jira/browse/ARROW-5611) / @fsaintjacques **Note**: *This issue was originally created as [ARROW-5611](https://issues.apache.org/jira/browse/ARROW-5611). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*
The expression cache in gandiva uses the ToString() method of arrow::DataType() for both hashing and equality. This is error-prone - we should have a visitor for generating hash, and...
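To illustrate why string-based cache keys can be error-prone, here is a minimal Python sketch. It does not use Arrow or gandiva; `FakeType`, `to_string`, `string_key`, and `structural_key` are hypothetical names invented for this example, and the assumption is simply that a type's string form may omit a detail that matters for equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeType:
    """Hypothetical stand-in for a data type (not an Arrow class)."""
    name: str
    nullable_child: bool  # assumed detail that the string form leaves out

    def to_string(self) -> str:
        # The string form only carries the name, so it loses information.
        return self.name


def string_key(t: FakeType) -> str:
    """Cache key derived from the string representation."""
    return t.to_string()


def structural_key(t: FakeType) -> tuple:
    """Cache key derived from every field that affects equality."""
    return (t.name, t.nullable_child)


a = FakeType("list<int32>", nullable_child=True)
b = FakeType("list<int32>", nullable_child=False)

# Two distinct types collide under the string key...
collides = string_key(a) == string_key(b)
# ...but stay distinct under a key built from the structure itself.
distinct = structural_key(a) != structural_key(b)
```

A visitor that hashes each field of the type plays the role of `structural_key` here: the key can only collide when the types are actually equal.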
Description is by @kou: I want to use GitLab Runner instead of CircleCI because we can add our own custom GitLab Runners. For example, we can add GPU-enabled GitLab...
Followup to ARROW-5178. See discussion on . **Reporter**: [Neal Richardson](https://issues.apache.org/jira/browse/ARROW-5761) / @nealrichardson **Note**: *This issue was originally created as [ARROW-5761](https://issues.apache.org/jira/browse/ARROW-5761). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*
I'm trying to read CSV files as is, with all columns as strings. I don't know the schema of these CSVs, and they will vary because they are provided by the user. Right...
`ParquetManifest._visit_directories` uses a `ThreadPoolExecutor` to visit partitioned parquet datasets concurrently. It waits for them to finish but doesn't check whether the respective futures have failed. This is quite...
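The general pattern for surfacing worker failures can be sketched with the standard library alone. This is not pyarrow's code; `visit_all` and `visit` are hypothetical names, and the sketch only assumes that each directory visit is an independent callable submitted to a `ThreadPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def visit_all(paths, visit, max_workers=4):
    """Run visit(path) concurrently and propagate the first failure.

    `visit` is a hypothetical per-directory callback.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the path it was submitted for.
        futures = {pool.submit(visit, p): p for p in paths}
        for fut in as_completed(futures):
            # result() re-raises any exception raised inside the worker,
            # so a failed visit is no longer silently ignored.
            results[futures[fut]] = fut.result()
    return results
```

Merely waiting on the futures (for example with `concurrent.futures.wait`) reports completion but not success; calling `result()` or `exception()` on each future is what exposes a failure to the caller.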
From a comment from @wesm in ARROW-2714: > The Tensor classes are independent from the columnar data structures, though they reuse pieces of metadata, metadata serialization, memory management, and IPC....
Since Arrow (and pyarrow) have several independent optional components, it would be convenient if, instead of installing all of them, they could be opted into from pip, like `pip install pyarrow[gandiva,flight,plasma]`...