Prashant Singh
### What changes were proposed in this pull request? We should propagate the row-count stats in `SizeInBytesOnlyStatsPlanVisitor` if available. Row counts are propagated from connectors to Spark in case...
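The idea can be sketched in a few lines. This is not Spark's actual visitor; it is a minimal illustration with a hypothetical `Stats` record standing in for Spark's `Statistics`, showing a size estimate derived from the connector-reported row count instead of discarding it.

```java
import java.util.OptionalLong;

// Hypothetical illustration: when a connector reports a row count, carry it
// through the plan visitor and use it for the size estimate when present.
public class StatsPropagation {
    // Simplified stand-in for Spark's Statistics (sizeInBytes + optional rowCount).
    public record Stats(long sizeInBytes, OptionalLong rowCount) {}

    // For a unary node: recompute size from the row count when available,
    // and propagate the row count itself rather than dropping it.
    public static Stats visitUnaryNode(Stats child, int outputRowByteWidth) {
        long size = child.rowCount().isPresent()
                ? child.rowCount().getAsLong() * outputRowByteWidth
                : child.sizeInBytes();
        return new Stats(size, child.rowCount());
    }
}
```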
### About the change Presently, Spark queries fail when using S3FileIO and GlueCatalog together with KryoSerializer, ref https://github.com/apache/iceberg/issues/5414#issuecomment-1204319969. This happens because the immutable map that is part of the S3FileIO properties is...
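The workaround pattern for this class of failure can be shown with the standard library alone. This hypothetical holder class (not Iceberg's actual fix) illustrates the idea: rather than keeping an immutable map implementation in a serializable object's state, copy it into a plain `HashMap`, which serializers such as Kryo handle without custom registrations.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the defensive-copy workaround: the serialized form
// is decoupled from whatever immutable Map implementation the caller used.
public class SerializableProperties implements Serializable {
    private final HashMap<String, String> properties;

    public SerializableProperties(Map<String, String> props) {
        // Copy into a plain HashMap so serializers do not need to
        // understand Guava/JDK immutable collection internals.
        this.properties = new HashMap<>(props);
    }

    public Map<String, String> properties() {
        return properties;
    }
}
```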
## Which issue does this PR close? Closes https://github.com/apache/datafusion-comet/issues/198 ## Rationale for this change ## What changes are included in this PR? Support for BNLJ; this support is missing in...
### What is the problem the feature request solves? DataFusion supports Cross and NestedLoop joins as well: https://docs.rs/datafusion-physical-plan/36.0.0/datafusion_physical_plan/joins/index.html It would be really nice if we could add support for them...
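For context, the operator in question is the textbook nested-loop join. The sketch below is only the classic algorithm in Java, not Comet's implementation (which delegates to DataFusion's `NestedLoopJoinExec`); it shows why the operator matters: it evaluates arbitrary, non-equi join conditions that hash and sort-merge joins cannot handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Textbook nested-loop join: for every left row, scan all right rows and
// keep the pairs satisfying an arbitrary join predicate.
public class NestedLoopJoin {
    public static <L, R> List<Object[]> join(
            List<L> left, List<R> right, BiPredicate<L, R> condition) {
        List<Object[]> out = new ArrayList<>();
        for (L l : left) {
            for (R r : right) {
                if (condition.test(l, r)) {
                    out.add(new Object[] {l, r});
                }
            }
        }
        return out;
    }
}
```

A non-equi condition such as `l < r` is exactly the case that forces this operator, since there is no key to hash or sort on.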
### About the change Presently, when using Brotli as the compression codec for Parquet, it fails with ``` Caused by: org.apache.parquet.hadoop.BadConfigurationException: Class org.apache.hadoop.io.compress.BrotliCodec was not found at org.apache.parquet.hadoop.CodecFactory.getCodec(CodecFactory.java:243) at org.apache.parquet.hadoop.CodecFactory$HeapBytesCompressor.<init>(CodecFactory.java:144) at...
### About the changes This change attempts to pre-fetch row groups while reading Parquet files. This is the first part of the changes proposed in https://github.com/apache/iceberg/issues/647 cc @jackye1995 @rdblue
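The pre-fetch pattern can be sketched with `CompletableFuture`. All names here are hypothetical (this is not Iceberg's reader API): while the consumer processes row group N, the fetch of row group N+1 is already in flight on a background thread, hiding I/O latency behind decode time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntFunction;

// Hypothetical sketch: overlap fetching of the next row group with
// consumption of the current one.
public class RowGroupPrefetcher {
    public static List<byte[]> readAll(int rowGroupCount, IntFunction<byte[]> fetch) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            List<byte[]> groups = new ArrayList<>();
            CompletableFuture<byte[]> next =
                    CompletableFuture.supplyAsync(() -> fetch.apply(0), pool);
            for (int i = 0; i < rowGroupCount; i++) {
                byte[] current = next.join();     // wait for the in-flight fetch
                if (i + 1 < rowGroupCount) {      // kick off the next fetch early
                    final int n = i + 1;
                    next = CompletableFuture.supplyAsync(() -> fetch.apply(n), pool);
                }
                groups.add(current);              // decoding would happen here
            }
            return groups;
        } finally {
            pool.shutdown();
        }
    }
}
```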
## About the change The UUID type in the Parquet writer expects a ByteBuffer rather than a UUID; otherwise the writer fails with: ``` class java.util.UUID cannot be cast to class [B...
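The conversion the fix needs is small and stdlib-only. Parquet's UUID logical type is a 16-byte fixed value, so a `java.util.UUID` has to be written as its two longs in big-endian order rather than handed to the writer directly; a minimal sketch (hypothetical helper name):

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper: encode a java.util.UUID as the 16-byte big-endian
// buffer that Parquet's UUID logical type expects.
public class UuidUtil {
    public static ByteBuffer toByteBuffer(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);   // 128 bits
        buf.putLong(uuid.getMostSignificantBits()); // big-endian by default
        buf.putLong(uuid.getLeastSignificantBits());
        buf.rewind();                               // ready for the writer to read
        return buf;
    }
}
```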
### About the change Solves https://github.com/apache/iceberg/issues/13005 This PR resumes the work on scan planning (previous PR: https://github.com/apache/iceberg/pull/11180); presently, since the client is not ready, it can't be used even...
### About the change Make _maxRecordsPerMicrobatch_ a soft limit: in cases where, for example, the max number of records is less than the totalRecords of a file, we would expect...
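The soft-limit semantics can be illustrated with a hypothetical admission check (not Iceberg's actual code). A hard limit would refuse a file whose record count alone exceeds the budget, leaving the micro-batch empty and the stream stuck; a soft limit always admits the first file and tolerates the overshoot so the stream makes progress.

```java
// Hypothetical sketch of soft-limit admission for a micro-batch.
public class MicroBatchLimit {
    public static boolean shouldAddFile(
            long recordsSoFar, long fileRecordCount, long maxRecordsPerMicroBatch) {
        // Soft limit: always admit at least one file per micro-batch, even
        // when that file alone is larger than the configured cap.
        if (recordsSoFar == 0) {
            return true;
        }
        // After the first file, stop once the cap would be exceeded.
        return recordsSoFar + fileRecordCount <= maxRecordsPerMicroBatch;
    }
}
```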
### About the change Presently, we retry on 502/504 as well, [here](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/ExponentialHttpRequestRetryStrategy.java#L86). The spec states [here](https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L1070) that when these statuses are returned, the commit status...
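The distinction being drawn can be sketched with a hypothetical helper (not the actual `ExponentialHttpRequestRetryStrategy`), assuming, per the linked spec text, that 502/504 from a gateway leave the commit's outcome unknown: blindly retrying such a commit could apply it twice, so those statuses should route to commit-status resolution rather than the generic retry path.

```java
import java.util.Set;

// Hypothetical sketch separating blindly-retryable statuses from ones
// where the commit outcome is unknown and must be resolved first.
public class RetryPolicy {
    // Safe to retry the same request: the server told us it was not applied.
    private static final Set<Integer> RETRYABLE = Set.of(429, 503);
    // Gateway errors (assumption from the snippet): the commit may or may
    // not have landed, so check its status before re-attempting.
    private static final Set<Integer> COMMIT_STATUS_UNKNOWN = Set.of(502, 504);

    public static boolean canRetryBlindly(int status) {
        return RETRYABLE.contains(status);
    }

    public static boolean mustCheckCommitStatus(int status) {
        return COMMIT_STATUS_UNKNOWN.contains(status);
    }
}
```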