MithunR
Followup to #6288. The `spark-rapids` plugin currently supports querying `map` scalars only with key scalars. Map vectors can be queried with both scalar and vector keys. So the following query...
The Spark Executor JVM crashes when an `UPDATE` command is run on a Delta table on Databricks 10.4. This does not appear to break on Apache Spark (3.2.1, at least). ##...
In its current implementation, the `Scalar` Java class does not consider the `scale` of a scalar value when comparing two `DECIMAL` scalars. Here is the section of `Scalar.equals()` that compares...
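To illustrate why ignoring `scale` matters, here is a minimal Python sketch. It is not the `Scalar.equals()` code referred to above, which is elided in this listing. `DecimalScalar`, `equals_ignoring_scale`, and `equals_with_scale` are hypothetical names, and the sketch assumes a decimal is stored as an unscaled integer plus a scale, with the value being `unscaled * 10**scale`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecimalScalar:
    """Hypothetical stand-in for a DECIMAL scalar: value = unscaled * 10**scale."""
    unscaled: int
    scale: int


def equals_ignoring_scale(a: DecimalScalar, b: DecimalScalar) -> bool:
    # Compares only the stored integer, so 123.45 and 12.345 look equal.
    return a.unscaled == b.unscaled


def equals_with_scale(a: DecimalScalar, b: DecimalScalar) -> bool:
    # Also requires the scales to match before declaring equality.
    return a.unscaled == b.unscaled and a.scale == b.scale


a = DecimalScalar(unscaled=12345, scale=-2)  # represents 123.45
b = DecimalScalar(unscaled=12345, scale=-3)  # represents 12.345
```

With these two values, the scale-blind comparison reports equality even though the numbers differ by a factor of ten, while the scale-aware comparison does not.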
**Description** This was uncovered in [Spark tests](https://github.com/NVIDIA/spark-rapids/pull/9366) that compare Parquet read/write compatibility with [`fastparquet`](https://fastparquet.readthedocs.io/en/latest/index.html). The last row of a String column written with `fastparquet` seems to be interpreted by CUDF...
This bug is to track a (possible) misinterpretation of Parquet list schemas when stored in a legacy format. This is a follow-up to https://github.com/rapidsai/cudf/pull/13277. This is specific to rules #3...
## Description This commit adds a new `strings::contains()` overload that allows for the search of multiple scalar search targets in the same call. The trick here is that a new...
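As a rough picture of what a multi-target search returns, the following pure-Python sketch checks every input string against every target in one call. It only models the semantics, not the GPU implementation; `contains_multiple` is a hypothetical name, and returning one boolean column per target (with null rows staying null) is an assumption made for illustration.

```python
from typing import List, Optional


def contains_multiple(
    strings: List[Optional[str]], targets: List[str]
) -> List[List[Optional[bool]]]:
    """Return one boolean column per target; a null input row stays null."""
    return [
        [None if s is None else (t in s) for s in strings]
        for t in targets
    ]


result = contains_multiple(["spark rapids", "cudf", None], ["ra", "cu"])
```

Here `result[0]` holds the matches for `"ra"` and `result[1]` the matches for `"cu"`, so the caller gets all answers from a single pass over the API instead of one call per target.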
**Description** This is the result of auditing [SPARK-47247](https://github.com/apache/spark/commit/e310e76e63f). SPARK-47247 changes the target partition size for AQE partition coalescing (`spark.sql.adaptive.advisoryPartitionSizeInBytes`) from the default of `64MB` to the value of `spark.sql.adaptive.coalescePartitions.minPartitionSize` (whose...
Fixes #11029. Some tests in subquery_test.py fail when run with ANSI mode enabled, because certain array columns are accessed with invalid indices. These tests predate the availability of ANSI mode...
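The behavioural difference behind these failures can be sketched in plain Python. This models only the semantics of a 0-based array index lookup: with ANSI mode off, an out-of-bounds index yields null, while with ANSI mode on it raises an error. `get_array_item` is a hypothetical helper, not a Spark or plugin API.

```python
from typing import List, Optional


def get_array_item(arr: List[int], index: int, ansi_enabled: bool) -> Optional[int]:
    """Model a 0-based array lookup under ANSI and non-ANSI semantics."""
    if 0 <= index < len(arr):
        return arr[index]
    if ansi_enabled:
        # ANSI mode treats an invalid index as an error.
        raise IndexError(f"index {index} is out of bounds for length {len(arr)}")
    # Non-ANSI mode quietly returns null for an invalid index.
    return None
```

A test written before ANSI mode existed can generate invalid indices freely, because it only ever sees nulls. Under ANSI mode the same data makes the query fail, so the indices need to be constrained to the valid range.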
(Partially) fixes #11031. This PR addresses tests that fail on Spark 4.0 in the following files:
1. `integration_tests/src/main/python/datasourcev2_read_test.py`
2. `integration_tests/src/main/python/expand_exec_test.py`
3. `integration_tests/src/main/python/get_json_test.py`
4. `integration_tests/src/main/python/hive_delimited_text_test.py`
5. `integration_tests/src/main/python/logic_test.py`
6. `integration_tests/src/main/python/repart_test.py`
7. `integration_tests/src/main/python/time_window_test.py`
`get_json_test.py::test_get_json_object_quoted_question` fails on Spark 4 with mismatched output: ``` ---------------------------- Captured stderr setup ----------------------------- 2024-07-02 23:57:30 INFO Running test 'src/main/python/get_json_test.py::test_get_json_object_quoted_question[DATAGEN_SEED=1719964646, TZ=UTC]' ------------------------------ Captured log setup ------------------------------ INFO __pytest_worker_logger__:spark_init_internal.py:256 Running test...