MithunR
@NVnavkumar, I was wondering if you might take another look at this one.
There seems to be an error on Spark 3.3, where the expected exception isn't thrown. It's taking a bit of time to reproduce. I'll update here once I have something.
I think I've addressed the Databricks failure. I'll kick off another build, and request the reviewers for another round.
@NVnavkumar, I've fixed the last nit. Does this look agreeable to you?
Thank you for reviewing, @NVnavkumar. This change has now been merged.
It's been a while since I worked on `fastparquet`. My understanding is that we don't currently have a direct dependency on `numpy` (beyond the transitive dependency via `fastparquet`). One option...
@pxLi: Sorry for the delayed response. I suspect that the Python version will not have a material effect on the behaviour of fastparquet. I'm +1 on this change to update...
Here is a sampling of the most egregious diffs in the output:
```
-Row(p=None, oby=None, short_double_sum=None, double_sum=3.320125371694111e-34, short_float_sum=None, float_sum=1.425792694093375e+33, dec_sum=None)
+Row(p=None, oby=None, short_double_sum=None, double_sum=3.3201253716941104e-34, short_float_sum=None, float_sum=1.425792694093375e+33, dec_sum=None)
-Row(p=None, oby=None, short_double_sum=None,...
```
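For context on why only the last digit of `double_sum` moves: floating-point addition is not associative, so a sum computed with a different grouping or partitioning can legitimately differ from a serial sum by one unit in the last place. A minimal sketch (standalone illustration, not taken from this test suite):

```python
import math

# Floating-point addition is not associative: the same three values summed
# with different grouping land on adjacent representable doubles.
a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
b = 0.1 + (0.2 + 0.3)   # 0.6

print(a == b)           # False: exact comparison sees the last-ULP difference

# Comparisons that tolerate such last-digit drift typically use a relative
# tolerance instead of exact equality.
print(math.isclose(a, b, rel_tol=1e-9))  # True
```

This is why float/double aggregations are usually compared with an approximate-equality check rather than exact string diffs of `Row` output.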