Haejoon Lee comments

Results: 95 comments of Haejoon Lee

[SPARK-46858][PYTHON][PS][BUILD] Upgrade Pandas to 2.2.0

- Is the change to `python/pyspark/pandas/resample.py` safe? It breaks the previous behavior, so if we plan to ship another minor release (Spark 3.6.0), this should not be included.
- What...

[SPARK-46858][PYTHON][PS][BUILD] Upgrade Pandas to 2.2.0

We should not introduce any breaking changes. Let me address them. Thanks, @dongjoon-hyun, for double-checking.

[SPARK-46858][PYTHON][PS][BUILD] Upgrade Pandas to 2.2.0

Oh, wait. I just remembered that we simply follow the pandas behavior and list the breaking changes separately in the [release note](https://github.com/apache/spark/blob/master/python/docs/source/migration_guide/pyspark_upgrade.rst). ``` - In Spark 4.0, it is recommended to...

[SPARK-46858][PYTHON][PS][BUILD] Upgrade Pandas to 2.2.0

Just updated resample to work with older pandas as well. I think we can just mark it as deprecated for now to avoid breaking existing pipelines. (Also updated the...
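A minimal sketch of the "deprecate instead of break" approach mentioned above, using only the standard library. The function name `legacy_option` and the warning text are hypothetical and are not taken from the PR; the point is only that the old input keeps working while callers are warned.

```python
import warnings


def legacy_option(value: str) -> str:
    """Hypothetical stand-in for a code path kept alive for existing pipelines."""
    # Warn instead of raising, so existing callers keep working for now.
    warnings.warn(
        "This behavior is deprecated and may change in a future release.",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
    return value
```

Callers still get the same return value, and the warning can later be turned into an error once the old behavior is removed.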

[SPARK-46858][PYTHON][PS][BUILD] Upgrade Pandas to 2.2.0

Thank you all so much for the review!

[SPARK-46820][PYTHON] Fix error message regression by restoring `new_msg`

Thanks, @HyukjinKwon, for reviewing. Just fixed regressions from the past few PRs, and updated the PR title and description accordingly.

Attribute Error: module 'numpy' has no attribute 'bool'

ditto. Please see https://github.com/databricks/koalas/issues/2223#issuecomment-1789845928.

[SPARK-48751][INFRA][PYTHON][TESTS] Re-balance `pyspark-pandas-connect` tests on GA

Looks fine for now, but in the future we might need to split this into more parts instead of just rebalancing if the number of tests increases.

[SPARK-48710][PYTHON] Use NumPy 2.0 compatible types

Let's use the default PR template: ``` ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?...

[SPARK-48710][PYTHON] Use NumPy 2.0 compatible types

Oh, okay, it seems NumPy recently bumped its major version (2024-06-17): [Release Note](https://github.com/numpy/numpy/releases/tag/v2.0.0). @HyukjinKwon Should we maybe raise the minimum supported NumPy version to 2.0.0, as we did for...
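For context on what "NumPy 2.0 compatible types" can involve: NumPy 2.0 removed several aliases such as `np.float_` and `np.NaN` in favor of `np.float64` and `np.nan`. The sketch below is not the PR's code. It is a small, hypothetical helper (`modernize`) that rewrites a partial list of removed aliases in source text, assuming the conventional `np` import alias.

```python
import re

# Partial list of aliases removed in NumPy 2.0 and their replacements.
REMOVED_ALIASES = {
    "float_": "float64",
    "complex_": "complex128",
    "NaN": "nan",
    "Inf": "inf",
    "unicode_": "str_",
    "string_": "bytes_",
}

# Match `np.<alias>` only as a whole name, so `np.float_power` is left alone.
_PATTERN = re.compile(
    r"\bnp\.(" + "|".join(map(re.escape, REMOVED_ALIASES)) + r")\b"
)


def modernize(source: str) -> str:
    """Replace removed NumPy aliases in `source` (assumes the `np` alias)."""
    return _PATTERN.sub(lambda m: "np." + REMOVED_ALIASES[m.group(1)], source)


print(modernize("x = np.float_(1.0) + np.NaN"))
```

The replacement names also exist in NumPy 1.x, so code rewritten this way should not by itself require raising the minimum NumPy version.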