Gang Wu
Let me try to enable ASAN to reproduce it. EDIT: Unfortunately, I cannot reproduce this with or without ASAN.
I remember there was about a 3% speedup when reading a sample Parquet file.
> > (e.g. if they happen to be exposing thrift structures as a public API)
> >
> > what do we know here about this? which apps do/don't?

@steveloughran There is...
IMO, it would be better to be in sync with the version used in the current `parquet-format`.
I think the process should be:
- merge the parquet-format PR for 0.22.0
- release the next parquet-format version
- bump the released parquet-format version in parquet-java
- bump the...
Apache Parquet Format 2.12.0 is released: https://github.com/apache/parquet-format/releases/tag/apache-parquet-format-2.12.0 Let me merge this first. Then I'll bump the parquet-format version before releasing parquet-java. Thanks @vinooganesh!
I'm not an expert in Python. I just want to ask a general question on whether we should enforce using the merge script for parquet-java (or other parquet-xxx projects). It has...
There is a similar script in Apache ORC: https://github.com/apache/orc/blob/main/dev/merge_orc_pr.py. It can help keep the PR description, update the JIRA state, and backport the commit to other branches in a single shot. Perhaps...
> In Iceberg we create backports by hand. We checkout the branch, and backport using cherry-picking the commits, and create a PR.

Can I ask how you resolve merge conflicts...
It might be useful to have multiple row groups.