mwish

Results 326 comments of mwish

Generally this method is ok for me, but I'm not so familiar with the "common solutions" here. I'll dive into Presto/ClickHouse to see the common pattern here

Sorry for the late reply. Would you mind fixing this, or letting me fix it? @biljazovic

The problem is that `Peek` and `Read` both call `SetBufferSize`; however: 1. `Read` implicitly says that, on `SetBufferSize` or a read, the previous buffer is no longer required. In this scenario, `bytes_buffered_`...

About (1): some optimizations will be included later, see: 1. https://github.com/apache/arrow/pull/37868 (this patch should be revisited). It seems we can enable a larger prefetch depth to allow fetching multiple files concurrently.

The general idea LGTM. It's a bit late in my timezone, so I will take a careful look tomorrow.

I was busy previously; sorry for the delay. I found it a bit hard for Parquet to optimize tensors; maybe the problem is the rep-def levels for tensor / fixed-length byte array....

https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#lists Without legacy: > The element field encodes the list's element type and repetition. Element repetition must be required or optional. With backward compatibility: Some existing data does not include...

The Parquet schema is too tricky for me. I'll try to take a look at https://github.com/apache/parquet-java/blob/aec7bc64dffa373db678ab2fc8b46565b4c011a5/parquet-thrift/src/main/java/org/apache/parquet/thrift/ThriftSchemaConvertVisitor.java#L220 tomorrow...

I've checked the related Java code: https://github.com/apache/parquet-java/blob/aec7bc64dffa373db678ab2fc8b46565b4c011a5/parquet-column/src/main/java/org/apache/parquet/schema/Type.java https://github.com/apache/parquet-java/blob/aec7bc64dffa373db678ab2fc8b46565b4c011a5/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L145 I'll dive into it this afternoon