asfimport

Results 328 issues of asfimport

ParquetMR contains a suite of self-tests. When one of those self-tests fails, it would be nice to be able to pull up the test in an IDE like IntelliJ. Then...

Component: Parquet
Component: Testing
Priority: Blocker
Type: bug

We lack proper integration tests between components. Fortunately, we already have a git repository for uploading test data: https://github.com/apache/parquet-testing. The idea is the following. Create a directory...

Component: Parquet
Component: Testing
Priority: Major
Type: test

@pmouawad ([Bug 63456](https://bz.apache.org/bugzilla//show_bug.cgi?id=63456&redirect=false)): Hello, This could be a good start for future HTTP2 support. Regards OS: All

enhancement
os: All
P2

For consistency with S3FileSystem and others. See discussion at https://github.com/apache/arrow/pull/13404#discussion_r901799543 **Reporter**: [Neal Richardson](https://issues.apache.org/jira/browse/ARROW-16884) / @nealrichardson **Note**: *This issue was originally created as [ARROW-16884](https://issues.apache.org/jira/browse/ARROW-16884). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further...

Type: enhancement
Component: C++
good-first-issue

It's not possible to open an `abfs://` or `abfss://` URI with the pyarrow.fs.HadoopFileSystem. Using HadoopFileSystem.from_uri(path) does not work and libhdfs will throw an error saying that the authority is invalid...

Type: bug
Component: Python
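
As a rough illustration of what the "authority" part of such a URI looks like, the sketch below splits an ABFS-style URI with Python's standard library. It does not reproduce how pyarrow or libhdfs parse the URI, and the container and account names are made up for the example; the helper name `split_abfs_uri` is likewise hypothetical.

```python
from urllib.parse import urlsplit


def split_abfs_uri(uri: str) -> dict:
    """Split an abfs:// or abfss:// URI into its generic URI components."""
    parts = urlsplit(uri)
    if parts.scheme not in ("abfs", "abfss"):
        raise ValueError(f"not an ABFS URI: {uri!r}")
    return {
        "scheme": parts.scheme,
        # The authority is everything between "//" and the first "/".
        "authority": parts.netloc,
        # In ABFS URIs the part before "@" names the container.
        "container": parts.username,
        "account_host": parts.hostname,
        "path": parts.path,
    }


# Hypothetical container and account names, for illustration only.
example = split_abfs_uri(
    "abfss://mycontainer@myaccount.dfs.core.windows.net/data/part-0.parquet"
)
print(example["authority"])
```

The authority here contains an `@`, with the container in the user-info position. That is one plausible reason a parser expecting a plain `host:port` authority could reject it, though the truncated report above does not confirm the cause.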

Currently, when writing a dataset, e.g. from a table consisting of a set of record batches, there is no guarantee that the row order is preserved when reading the dataset....

Type: enhancement
Component: C++

Test runs of parquet-hadoop with `-Dhadoop.version=3.4.0` fail because there's a logback jar on the classpath, which screws things up (mostly, it seems, because it suddenly logs at debug). HADOOP-19084 should have...

Component: Hadoop
Component: Parquet
Priority: Major
Type: bug

@rdblue  pointed me to  which provides non-native implementations of compression codecs. It claims to be much faster than native wrappers that parquet uses. This Jira is to track the work...

Component: Parquet
Priority: Major
Type: enhancement

Parquet MR 1.8.2 does not support reading row groups that are larger than 2 GB. See: https://github.com/apache/parquet-mr/blob/parquet-1.8.x/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java#L1064 We are seeing this when writing skewed records. This throws off the estimation of...

Component: Java
Component: Parquet
Priority: Major
Type: bug

The command result is as follows: parquet-tools bloom-filter BloomFilter.snappy.parquet row-group 0: bloom filter for column id: NONE bloom filter for column uuid: Hash strategy: block Algorithm: block Compression: uncompressed Bitset size: 1048576...

Component: Parquet
Priority: Minor
Type: enhancement