Gang Wu

Results 304 comments of Gang Wu

Let me try to enable ASAN to reproduce it. EDIT: Unfortunately, I cannot reproduce this with or without ASAN.

As I recall, there was about a 3% speedup when reading a sample Parquet file.

> > (e.g. if they happen to be exposing thrift structures as a public API)
> > what do we know here about this? which apps do/don't?

@steveloughran There is...

IMO, it would be better to be in sync with the version used in the current `parquet-format`.

I think the process should be:
- merge the parquet-format PR for 0.22.0
- release the next parquet-format version
- bump the released parquet-format version in parquet-java
- bump the...

Apache Parquet Format 2.12.0 is released: https://github.com/apache/parquet-format/releases/tag/apache-parquet-format-2.12.0 Let me merge this first. Then I'll bump the parquet-format version before releasing parquet-java. Thanks @vinooganesh!

I'm not an expert in Python. I just want to ask a general question: should we enforce using the merge script for parquet-java (or other parquet-xxx projects)? It has...

There is a similar script in Apache ORC: https://github.com/apache/orc/blob/main/dev/merge_orc_pr.py. It can help keep the PR description, update the JIRA state, and backport the commit to other branches in a single step. Perhaps...

> In Iceberg we create backports by hand. We checkout the branch, and backport using cherry-picking the commits, and create a PR.

Can I ask how you resolve merge conflicts...

It might be useful to have multiple row groups.