Prashant Singh
### What changes were proposed in this pull request? We should propagate the row-count stats in `SizeInBytesOnlyStatsPlanVisitor` if available. Row counts are propagated from connectors to Spark in case...
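The idea can be sketched in a few lines. This is not Spark's actual visitor; it is a minimal illustration with a hypothetical `Stats` record standing in for Spark's `Statistics`, showing a size estimate derived from the connector-reported row count instead of discarding it.

```java
import java.util.OptionalLong;

// Hypothetical illustration: when a connector reports a row count, carry it
// through the plan visitor and use it for the size estimate when present.
public class StatsPropagation {
    // Simplified stand-in for Spark's Statistics (sizeInBytes + optional rowCount).
    public record Stats(long sizeInBytes, OptionalLong rowCount) {}

    // For a unary node: recompute size from the row count when available,
    // and propagate the row count itself rather than dropping it.
    public static Stats visitUnaryNode(Stats child, int outputRowByteWidth) {
        long size = child.rowCount().isPresent()
                ? child.rowCount().getAsLong() * outputRowByteWidth
                : child.sizeInBytes();
        return new Stats(size, child.rowCount());
    }
}
```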
### About the change Presently, Spark queries fail when using S3FileIO and GlueCatalog together with KryoSerializer, ref https://github.com/apache/iceberg/issues/5414#issuecomment-1204319969. This happens because the immutable map that is part of the S3FileIO properties is...
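The workaround pattern for this class of failure can be shown with the standard library alone. This hypothetical holder class (not Iceberg's actual fix) illustrates the idea: rather than keeping an immutable map implementation in a serializable object's state, copy it into a plain `HashMap`, which serializers such as Kryo handle without custom registrations.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the defensive-copy workaround: the serialized form
// is decoupled from whatever immutable Map implementation the caller used.
public class SerializableProperties implements Serializable {
    private final HashMap<String, String> properties;

    public SerializableProperties(Map<String, String> props) {
        // Copy into a plain HashMap so serializers do not need to
        // understand Guava/JDK immutable collection internals.
        this.properties = new HashMap<>(props);
    }

    public Map<String, String> properties() {
        return properties;
    }
}
```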
## Which issue does this PR close? Closes https://github.com/apache/datafusion-comet/issues/198 ## Rationale for this change ## What changes are included in this PR? Support for BNLJ; this support is missing in...
### What is the problem the feature request solves? DataFusion supports Cross and NestedLoop joins as well: https://docs.rs/datafusion-physical-plan/36.0.0/datafusion_physical_plan/joins/index.html It would be really nice if we could add support for them...
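For context, the operator in question is the textbook nested-loop join. The sketch below is only the classic algorithm in Java, not Comet's implementation (which delegates to DataFusion's `NestedLoopJoinExec`); it shows why the operator matters: it evaluates arbitrary, non-equi join conditions that hash and sort-merge joins cannot handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Textbook nested-loop join: for every left row, scan all right rows and
// keep the pairs satisfying an arbitrary join predicate.
public class NestedLoopJoin {
    public static <L, R> List<Object[]> join(
            List<L> left, List<R> right, BiPredicate<L, R> condition) {
        List<Object[]> out = new ArrayList<>();
        for (L l : left) {
            for (R r : right) {
                if (condition.test(l, r)) {
                    out.add(new Object[] {l, r});
                }
            }
        }
        return out;
    }
}
```

A non-equi condition such as `l < r` is exactly the case that forces this operator, since there is no key to hash or sort on.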
### About the change Presently, when using Brotli as the compression codec for Parquet, it fails with ``` Caused by: org.apache.parquet.hadoop.BadConfigurationException: Class org.apache.hadoop.io.compress.BrotliCodec was not found at org.apache.parquet.hadoop.CodecFactory.getCodec(CodecFactory.java:243) at org.apache.parquet.hadoop.CodecFactory$HeapBytesCompressor.<init>(CodecFactory.java:144) at...
### About the changes This change attempts to pre-fetch row groups while reading Parquet files. This is the first part of the changes proposed in https://github.com/apache/iceberg/issues/647 cc @jackye1995 @rdblue
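The pre-fetch pattern can be sketched with `CompletableFuture`. All names here are hypothetical (this is not Iceberg's reader API): while the consumer processes row group N, the fetch of row group N+1 is already in flight on a background thread, hiding I/O latency behind decode time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntFunction;

// Hypothetical sketch: overlap fetching of the next row group with
// consumption of the current one.
public class RowGroupPrefetcher {
    public static List<byte[]> readAll(int rowGroupCount, IntFunction<byte[]> fetch) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            List<byte[]> groups = new ArrayList<>();
            CompletableFuture<byte[]> next =
                    CompletableFuture.supplyAsync(() -> fetch.apply(0), pool);
            for (int i = 0; i < rowGroupCount; i++) {
                byte[] current = next.join();     // wait for the in-flight fetch
                if (i + 1 < rowGroupCount) {      // kick off the next fetch early
                    final int n = i + 1;
                    next = CompletableFuture.supplyAsync(() -> fetch.apply(n), pool);
                }
                groups.add(current);              // decoding would happen here
            }
            return groups;
        } finally {
            pool.shutdown();
        }
    }
}
```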
## About the change The UUID type in the Parquet writer expects a ByteBuffer rather than a UUID; otherwise the writer fails with: ``` class java.util.UUID cannot be cast to class [B...
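The conversion the fix needs is small and stdlib-only. Parquet's UUID logical type is a 16-byte fixed value, so a `java.util.UUID` has to be written as its two longs in big-endian order rather than handed to the writer directly; a minimal sketch (hypothetical helper name):

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper: encode a java.util.UUID as the 16-byte big-endian
// buffer that Parquet's UUID logical type expects.
public class UuidUtil {
    public static ByteBuffer toByteBuffer(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);   // 128 bits
        buf.putLong(uuid.getMostSignificantBits()); // big-endian by default
        buf.putLong(uuid.getLeastSignificantBits());
        buf.rewind();                               // ready for the writer to read
        return buf;
    }
}
```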
### About the change Solves https://github.com/apache/iceberg/issues/13005 This PR resumes the work on scan planning (previous PR: https://github.com/apache/iceberg/pull/11180); presently, since the client is not ready, it can't be used even...
### About the change Make _maxRecordsPerMicrobatch_ a soft limit: in cases where, for example, the max number of records is less than the totalRecords of a file, we would expect...
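The soft-limit semantics can be illustrated with a hypothetical admission check (not Iceberg's actual code). A hard limit would refuse a file whose record count alone exceeds the budget, leaving the micro-batch empty and the stream stuck; a soft limit always admits the first file and tolerates the overshoot so the stream makes progress.

```java
// Hypothetical sketch of soft-limit admission for a micro-batch.
public class MicroBatchLimit {
    public static boolean shouldAddFile(
            long recordsSoFar, long fileRecordCount, long maxRecordsPerMicroBatch) {
        // Soft limit: always admit at least one file per micro-batch, even
        // when that file alone is larger than the configured cap.
        if (recordsSoFar == 0) {
            return true;
        }
        // After the first file, stop once the cap would be exceeded.
        return recordsSoFar + fileRecordCount <= maxRecordsPerMicroBatch;
    }
}
```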
### About the change Presently, we retry on 502/504 as well, [here](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/ExponentialHttpRequestRetryStrategy.java#L86). The spec states [here](https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L1070) that when these statuses are returned, the commit status...
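The distinction being drawn can be sketched with a hypothetical helper (not the actual `ExponentialHttpRequestRetryStrategy`), assuming, per the linked spec text, that 502/504 from a gateway leave the commit's outcome unknown: blindly retrying such a commit could apply it twice, so those statuses should route to commit-status resolution rather than the generic retry path.

```java
import java.util.Set;

// Hypothetical sketch separating blindly-retryable statuses from ones
// where the commit outcome is unknown and must be resolved first.
public class RetryPolicy {
    // Safe to retry the same request: the server told us it was not applied.
    private static final Set<Integer> RETRYABLE = Set.of(429, 503);
    // Gateway errors (assumption from the snippet): the commit may or may
    // not have landed, so check its status before re-attempting.
    private static final Set<Integer> COMMIT_STATUS_UNKNOWN = Set.of(502, 504);

    public static boolean canRetryBlindly(int status) {
        return RETRYABLE.contains(status);
    }

    public static boolean mustCheckCommitStatus(int status) {
        return COMMIT_STATUS_UNKNOWN.contains(status);
    }
}
```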