asfimport

Results 328 issues of asfimport

[Gorilla](https://www.vldb.org/pvldb/vol8/p1816-teller.pdf) is a de facto standard encoding algorithm for floating-point numbers; it has been used for a while by many time series databases, such as InfluxDB and TimescaleDB. For now Parquet only...

Priority: Major
Type: enhancement
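
For readers unfamiliar with the Gorilla paper linked above, its float compression rests on XORing each value's 64-bit pattern with the previous one, so that similar neighbours produce long runs of zero bits. The following is only a minimal sketch of that XOR step, not a full encoder and not anything proposed in the issue itself; the names `float_to_bits` and `xor_stream` are hypothetical helpers introduced for illustration.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the IEEE 754 double-precision bit pattern of x as an int."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_stream(values):
    """For each consecutive pair, return (xor, leading_zeros, trailing_zeros).

    Illustrative only: a real Gorilla-style encoder would go on to pack
    these fields into a bit stream with control bits.
    """
    out = []
    prev = float_to_bits(values[0])
    for v in values[1:]:
        cur = float_to_bits(v)
        x = prev ^ cur
        if x == 0:
            # Identical value: every bit of the XOR is zero.
            out.append((0, 64, 64))
        else:
            leading = 64 - x.bit_length()
            trailing = (x & -x).bit_length() - 1
            out.append((x, leading, trailing))
        prev = cur
    return out
```

The more leading and trailing zeros the XOR has, the fewer "meaningful" bits need to be written, which is why slowly changing series compress well under this scheme.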

I often need to create tens of millions of small dataframes and save them into Parquet files. All these dataframes have the same column and index information, and normally they...

Priority: Major
Type: enhancement

Hi parquet-mr team, while reading the Parquet writer in [ParquetFileWriter], I found that there is no `column metadata` following each column chunk, as described in the [Parquet-Format], and...

Priority: Major
Type: bug

Int8 and Int16 are not supported as basic types in previous versions. Using 4 bytes to store an int8 does not seem like a good idea, since it means requiring more storage and read and...

Priority: Major
Type: enhancement

Since this is a Parquet-specific encoder, it would be good to have a more complete description of the encoding/decoding, so that implementations have an easier time implementing it. **Reporter**: [Jorge...

Priority: Minor
Type: enhancement

The Nested Encoding section of the documentation doesn't escape the `_` character, so it renders as follows: Two encodings for the levels are supported BIT_PACKED and RLE. Only RLE is now...

Priority: Minor
Type: enhancement

In the delta encoding example, which encodes [1, 2, 3, 4, 5], we state that the final encoded data is: header: 8 (block size), 1 (miniblock count), 5 (value count),...

Priority: Minor
Type: bug
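
To make the quantities in that example easier to check by hand, here is a rough sketch of how the per-block values in a delta encoding of this kind can be derived: the first value, the deltas, the minimum delta, and the bit width needed for the deltas once the minimum is subtracted. It assumes all values fit in a single block, ignores miniblock partitioning and the varint/zigzag byte layout, and the function name `delta_block_summary` is hypothetical.

```python
def delta_block_summary(values):
    """Summarise one block of a delta encoding (needs at least two values).

    Returns (first_value, min_delta, adjusted_deltas, bit_width), where
    adjusted_deltas are the deltas with min_delta subtracted so that
    they are all non-negative.
    """
    first = values[0]
    deltas = [b - a for a, b in zip(values, values[1:])]
    min_delta = min(deltas)
    adjusted = [d - min_delta for d in deltas]
    # Number of bits needed for the largest adjusted delta (0 if all are 0).
    bit_width = max(adjusted).bit_length()
    return first, min_delta, adjusted, bit_width
```

For [1, 2, 3, 4, 5] every delta equals the minimum delta, so the adjusted deltas are all zero and no bits per value are required.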

Currently ColumnMetaData only contains bloom_filter_offset, which points to BloomFilterHeader followed by the bloom filter data. This solution is not optimal during reading, as two IO reads are needed once we...

Priority: Major
Type: enhancement

As I understand it, Parquet is a write-once format, so mutating data inside Parquet files is not an option. Now there is a new cross-EU law coming into effect...

Priority: Major
Type: enhancement

It would be great if Parquet stored `dictionary entries` for columns marked to be used for joins. When a column is used for a join (it could be a...

Priority: Major
Type: enhancement