**Reporter**: [Julien Le Dem](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=julienledem) / @julienledem

#### Related issues:
- [Upgrade Parquet to 1.9 (Fixes parquet sorting)](https://issues.apache.org/jira/browse/SPARK-13127) (blocks)
- [H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks](https://github.com/apache/parquet-java/issues/2017)...
After PARQUET-160 was resolved, ColumnChunkPageWriter started using ConcatenatingByteArrayCollector, so all page data is collected in a List of byte[] before the page is written. There is no way to use direct memory for...
While the min/max record counts for the page size check are configurable via the ParquetOutputFormat.MIN_ROW_COUNT_FOR_PAGE_SIZE_CHECK and ParquetOutputFormat.MAX_ROW_COUNT_FOR_PAGE_SIZE_CHECK settings, and via ParquetProperties directly, the min/max record counts for the block size check are hard...
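For context, the page size check counts mentioned above can be tuned today through either the Hadoop configuration keys or the ParquetProperties builder. A rough sketch (assuming a Parquet 1.9-era API; the constant and builder names are as used in the report):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.column.ParquetProperties;
import org.apache.parquet.hadoop.ParquetOutputFormat;

public class PageSizeCheckConfig {
  public static void main(String[] args) {
    // Option 1: via Hadoop configuration, picked up by ParquetOutputFormat.
    Configuration conf = new Configuration();
    conf.setInt(ParquetOutputFormat.MIN_ROW_COUNT_FOR_PAGE_SIZE_CHECK, 100);
    conf.setInt(ParquetOutputFormat.MAX_ROW_COUNT_FOR_PAGE_SIZE_CHECK, 10_000);

    // Option 2: directly through ParquetProperties.
    ParquetProperties props = ParquetProperties.builder()
        .withMinRowCountForPageSizeCheck(100)
        .withMaxRowCountForPageSizeCheck(10_000)
        .build();
    // No equivalent knobs exist for the block size check,
    // which is the gap this report describes.
  }
}
```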
When I run `ParquetWriter.getDataSize()`, it works normally. But after I call `ParquetWriter.close()`, subsequent calls to `ParquetWriter.getDataSize()` result in a NullPointerException:

```
java.lang.NullPointerException
	at org.apache.parquet.hadoop.InternalParquetRecordWriter.getDataSize(InternalParquetRecordWriter.java:132)
	at org.apache.parquet.hadoop.ParquetWriter.getDataSize(ParquetWriter.java:314)
	at FileBufferState.getFileSizeInBytes(FileBufferState.scala:83)
```

The...
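The general failure mode here is a close() that releases internal state which a later size query then dereferences. A self-contained illustration (this mimics the pattern only; it is not Parquet's actual code):

```java
// Illustrative only: a writer whose close() nulls an internal buffer,
// so a later size query dereferences null. Not Parquet's actual code.
public class CloseNpeDemo {
  static class Buffer {
    long size() { return 42; }
  }

  static class Writer {
    private Buffer buffer = new Buffer();

    void close() { buffer = null; }               // releases internal state
    long getDataSize() { return buffer.size(); }  // NPE if called after close()
    long getDataSizeSafe() {                      // hypothetical guarded variant
      return buffer == null ? 0 : buffer.size();
    }
  }

  public static void main(String[] args) {
    Writer w = new Writer();
    System.out.println(w.getDataSize()); // works before close()
    w.close();
    try {
      w.getDataSize();
    } catch (NullPointerException e) {
      System.out.println("NPE after close");
    }
    System.out.println(w.getDataSizeSafe()); // guarded variant returns 0
  }
}
```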
ParquetOutputFormat should support a custom OutputCommitter. There is a need to bypass the current Hadoop behavior of writing output data under the **_temporary** folder. With AWS S3 especially, there can be huge overhead...
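As a point of comparison, one commonly used mitigation for the `_temporary` rename overhead (separate from a pluggable committer, and assuming Hadoop 2.7+) is the v2 file output committer algorithm, which moves task output to the destination at task commit instead of renaming the whole tree at job commit:

```java
import org.apache.hadoop.conf.Configuration;

public class CommitterConfig {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Algorithm version 2 commits each task's output directly into the
    // final location, avoiding the job-commit rename of _temporary.
    conf.set("mapreduce.fileoutputcommitter.algorithm.version", "2");
  }
}
```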
Hi, Google released a new compression algorithm called Brotli: "a new compression algorithm for the internet," as they claim. Currently Firefox and Chrome (both developer editions) already support Brotli, as...
PARQUET-99 added settings to control the min and max number of rows between size checks when flushing pages, and a setting to control whether to always use a static size...
The current Log class is intended to allow swapping out logger back-ends, but SLF4J already does this. It also doesn't expose as nice an API as SLF4J, which can...
In Pig code (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/EvalFunc.java), a private member "inputSchemaInternal" represents the schema. A setter and getter are also provided:

```java
private Schema inputSchemaInternal = null;

/**
 * This method is for...
```
**Reporter**: [Alex Levenson](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=alexlevenson) / @isnotinvain

#### Related issues:
- [ThriftRecordConverter throws NPE for unrecognized enum values](https://github.com/apache/parquet-java/issues/1867) (relates to)

**Note**: *This issue was originally created as [PARQUET-351](https://issues.apache.org/jira/browse/PARQUET-351). Please see the [migration...