asfimport
Design a general extension mechanism for Parquet with the following requirements:
- old readers must be able to parse files written with the extension
- new readers must be able...

[_Replicated from mailing list_](https://lists.apache.org/thread/xot5f3ghhtc82n1bf0wdl9zqwlrzqks3) Arrow recently introduced the [FixedShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor) and [VariableShapeTensor](https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor) canonical extension types, which use FixedSizeList and StructArray(List, FixedSizeList) as storage, respectively. These are targeted at machine learning and scientific...
As we discussed on the mailing list, parquet.thrift uses the terms Record and Row to mean the same thing. Using consistent terminology will reduce the potential for confusion. The consensus...
There is currently no MIME type registered for Parquet. Perhaps this is intentional. If it is not intentional, I suggest steps be taken to register a MIME type with IANA. ...
Currently, the specification of `ColumnIndex` in `parquet.thrift` is inconsistent, leading to cases where it is impossible to create a Parquet file that conforms to the spec. The problem is...
The spec for DICTIONARY_ENCODING states that: > If the dictionary grows too big, whether in size or number of distinct values, the encoding will fall back to the plain encoding....
The parquet format specification doesn't say whether a Parquet file having columns with the same name (in the same group node, so really exactly the same name) is valid. I.e.,...
I have been running into a bug due to `parquet-format` and `parquet-format-structures` both defining the `org.apache.parquet.format.Util` class but doing so inconsistently. Examples of this are several methods which include a...
Currently, Parquet can use a BloomFilter for any physical type. However, when a BloomFilter is applied to float values:
1. What do +0 and -0 mean? Are they equal?
2. Should qNaN sNaN written...

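The +0/-0 and NaN questions matter because a Bloom filter hashes a value's bytes, not its numeric meaning. The sketch below shows one possible normalization policy, in which the writer and the reader both canonicalize a float before hashing. The class name `FloatBloomKey` and the method `canonicalBits` are hypothetical, and this is not a rule taken from the Parquet specification.

```java
// Hypothetical helper: illustrates one normalization policy, not the Parquet spec.
final class FloatBloomKey {
    private FloatBloomKey() {}

    /** Returns the bit pattern to hash, with -0.0f folded into +0.0f and all NaNs collapsed. */
    static int canonicalBits(float v) {
        if (v == 0.0f) {
            // The comparison is true for both +0.0f and -0.0f, so both map to the +0.0f pattern.
            return 0;
        }
        // Float.floatToIntBits maps every NaN payload to the canonical NaN 0x7fc00000.
        return Float.floatToIntBits(v);
    }

    public static void main(String[] args) {
        // Numerically equal, yet the raw bit patterns differ in the sign bit.
        System.out.println(0.0f == -0.0f);                                      // true
        System.out.println(Integer.toHexString(Float.floatToRawIntBits(-0.0f))); // 80000000
        // After normalization both zeros produce the same key.
        System.out.println(canonicalBits(0.0f) == canonicalBits(-0.0f));        // true
    }
}
```

Without a step like this, a filter populated with `-0.0f` could report `+0.0f` as absent even though the two compare equal, which is the ambiguity the issue raises.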
Each instance of `ColumnFilterPredicate` eagerly stores the filter values in its `toString` variable, which is not useful.

```java
static abstract class ColumnFilterPredicate implements FilterPredicate, Serializable {
  private final Column column;
  private...
```
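One way to avoid the eager cost is to build the string only when `toString()` is first called and cache the result. The sketch below illustrates that idea with a hypothetical stand-in class, `LazyDescribedPredicate`. It is not the parquet-mr class, and the field names and the `eq(...)` format are assumptions made for the example.

```java
import java.io.Serializable;

// Hypothetical stand-in for a column predicate; not the actual parquet-mr class.
final class LazyDescribedPredicate<T extends Serializable> implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String columnPath;
    private final T value;
    // Built on first use; transient so the cached text is not serialized with the predicate.
    private transient volatile String cachedToString;

    LazyDescribedPredicate(String columnPath, T value) {
        this.columnPath = columnPath;
        this.value = value;
    }

    /** Reports whether the description has been materialized yet. */
    boolean isDescriptionBuilt() {
        return cachedToString != null;
    }

    @Override
    public String toString() {
        String s = cachedToString;
        if (s == null) {
            // Benign race: two threads may both build the same string, which is harmless.
            s = "eq(" + columnPath + ", " + value + ")";
            cachedToString = s;
        }
        return s;
    }
}
```

With this shape, constructing a predicate over a large value does no string work unless something actually prints or logs it.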