Gang Wu
@raunaqmorarka You can send an email to [email protected] to subscribe. If you don't want to subscribe, you may directly send an email to [email protected]. You can see https://lists.apache.org/[email protected] for reference.
@emkornfield @pitrou @mapleFU Would you mind taking a look? Thanks!
```
optional group a (LIST) {
  repeated group array (LIST) {
    repeated int32 array;
  }
}
```

IMO, the root cause is that the current code recognizes the schema above...
> Our `ListToSchemaField` is like this part of the code https://github.com/apache/parquet-java/blob/aec7bc64dffa373db678ab2fc8b46565b4c011a5/parquet-avro/src/main/java/org/apache/parquet/avro/AvroSchemaConverter.java#L397-L421
>
> Should we port the impl and testings in that?

I think we are just missing check of...
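For reference, here is a rough sketch of the kind of check being discussed: deciding whether the repeated field inside a `LIST`-annotated group is itself the element (legacy two-level encoding) or only a wrapper around it (standard three-level encoding). The rules are paraphrased from the backward-compatibility section of the Parquet `LIST` specification as I understand it. This is not the actual `ListToSchemaField` or `AvroSchemaConverter` code, and `Node` and `repeated_field_is_element` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical, simplified schema node (not a real Parquet API)."""
    name: str
    repetition: str  # "required", "optional" or "repeated"
    children: Optional[List["Node"]] = None  # None marks a primitive field

    @property
    def is_group(self) -> bool:
        return self.children is not None


def repeated_field_is_element(list_name: str, repeated: Node) -> bool:
    """Return True if the repeated field is the list element itself.

    Assumed rules (paraphrased from the spec, to be double-checked):
    a repeated primitive, a repeated group with several fields, a repeated
    group whose only field is itself repeated, or a single-field group named
    `array` / `<list_name>_tuple` is treated as the element.
    """
    if not repeated.is_group:
        return True
    if not repeated.children:
        raise ValueError("a group must have at least one field")
    if len(repeated.children) > 1:
        return True
    only_child = repeated.children[0]
    if only_child.repetition == "repeated":
        return True
    if repeated.name in ("array", list_name + "_tuple"):
        return True
    # Otherwise the repeated group is just the wrapper of a three-level list.
    return False


# The repeated group from the schema above: `repeated group array (LIST)`
# holding a single `repeated int32 array` field.
inner = Node("array", "repeated", [Node("array", "repeated")])
# A standard three-level list: `repeated group list { optional ... element; }`
standard = Node("list", "repeated", [Node("element", "optional")])
```

Under these assumed rules, `repeated_field_is_element("a", inner)` is true, so `a` would be read as a list whose elements are themselves lists of `int32`, while `standard` is treated as a wrapper only.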
I'm using a Hive schema, which is why the name is `array`. The file can easily be produced by Spark SQL like below: ``` package org.example import org.apache.spark.sql.SparkSession object ParquetTwoLevelList { def...
I will try to use parquet-java to create a minimal file and add it to parquet-testing. The file created by Hudi is too large due to a file-level bloom filter...
Gentle ping :) @emkornfield @pitrou @mapleFU
@emkornfield Thanks for your review! I've rebased it and the test failure in `R / rhub/ubuntu-gcc12:latest` is unrelated (observed the same error from other PRs). I'll merge it.
Thanks for adding this! This is a large PR that I need to take some time to review. It would be good if @emkornfield @gszadovszky could take a look to...
BTW, the level histogram might not be available when max_level is 0, because there is only a single level (i.e. 0) and its count can be deduced from `num_values` of the...
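To illustrate the point, here is a minimal sketch of why the histogram is redundant when max_level is 0. The function names are hypothetical and this is not the writer or reader code from the PR; it only assumes that `num_values` is the number of level entries the histogram would cover.

```python
from collections import Counter
from typing import List, Optional, Sequence


def level_histogram(levels: Sequence[int], max_level: int) -> Optional[List[int]]:
    """Count occurrences of each level in 0..max_level.

    Returns None when max_level is 0: only level 0 can occur, so the
    histogram would just be [num_values] and carries no extra information.
    """
    if max_level == 0:
        return None
    counts = Counter(levels)
    return [counts.get(level, 0) for level in range(max_level + 1)]


def effective_histogram(
    histogram: Optional[List[int]], num_values: int, max_level: int
) -> List[int]:
    """Reader-side view: rebuild the omitted histogram from num_values."""
    if histogram is not None:
        return histogram
    if max_level == 0:
        return [num_values]
    raise ValueError("histogram is missing and cannot be deduced")
```

For example, `level_histogram([0, 2, 1, 2], 2)` gives `[1, 1, 2]`, while a column with max_level 0 stores nothing and a reader can recover `[num_values]` on its own.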