Haejoon Lee
Kindly asking, is there any update on this PR? IMHO, the most important goal of the pandas API on Spark is to match all behaviors to pandas as much as...
qq: Now that pandas 1.5.1 is released, is there any update related to this PR?
```
>>> std_result = kdf.select_dtypes(include=["timedelta"]).apply(lambda x: x.dt.total_seconds()).std()
>>> print(std_result)
Series([], dtype: float64)
```
Seems like the suggested method still returns an empty Series. Btw, switching Koalas to [Pandas API on Spark](https://spark.apache.org/docs/latest/api/python/user_guide/pandas_on_spark/index.html)...
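For comparison, here is a minimal plain-pandas sketch of the behavior being checked against (the column name `d` and the sample values are made up for illustration, not taken from the original report):

```python
import pandas as pd

# Toy DataFrame with a single timedelta column (illustrative data).
df = pd.DataFrame({"d": pd.to_timedelta(["1 days", "2 days", "3 days"])})

# Select the timedelta columns, convert each to seconds, then take the std.
# In plain pandas this yields a non-empty Series keyed by column name,
# which is what the pandas-on-Spark result would ideally match.
std_result = df.select_dtypes(include=["timedelta"]).apply(
    lambda x: x.dt.total_seconds()
).std()
print(std_result)
# d    86400.0
# dtype: float64
```

The values 86400/172800/259200 seconds differ by one day each, so the sample standard deviation comes out to exactly 86400.0.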
@tsafacjo could you open a ticket on [Spark JIRA](https://issues.apache.org/jira/projects/SPARK/issues) and make a fix in [pyspark.pandas](https://github.com/apache/spark/tree/master/python/pyspark/pandas) instead of Koalas? This repository is no longer maintained since Koalas has been migrated into...
np! Please feel free to ping me if you need any help contributing to Apache Spark.
IIRC there was no major issue with managing the `json` itself. However, since we cannot integrate with the [error-classes.json](https://github.com/databricks/runtime/blob/master/common/utils/src/main/resources/error/error-classes.json) file on the JVM side - because we didn't want to...
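To illustrate the general idea being discussed (this is a hypothetical sketch, not the actual PySpark implementation; the error class name, message template, and helper are invented for illustration), error classes can be kept in a JSON document and messages rendered from templates:

```python
import json

# Hypothetical JSON document of error classes, similar in spirit to the
# error-classes.json file on the JVM side: each class maps to a list of
# message-template lines with <placeholder> parameters.
ERROR_CLASSES_JSON = """
{
  "NOT_A_COLUMN": {
    "message": ["Argument `<arg_name>` should be a Column, got <arg_type>."]
  }
}
"""

def get_message(error_class: str, params: dict) -> str:
    # Look up the template for the class and substitute each parameter.
    classes = json.loads(ERROR_CLASSES_JSON)
    template = " ".join(classes[error_class]["message"])
    for name, value in params.items():
        template = template.replace(f"<{name}>", value)
    return template

print(get_message("NOT_A_COLUMN", {"arg_name": "col", "arg_type": "str"}))
# → Argument `col` should be a Column, got str.
```

Keeping the classes in a standalone JSON document is what makes cross-language sharing attractive, which is the integration question raised above.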
> This is for a separate potential PR, but if it were possible to use the "main" error JSON files from Scala in PySpark automatically, would we want to do...
> Is this a reference to this command?

Yes, so you might need to fix the description at https://github.com/apache/spark/blob/90e6c0cf2ca186d1a492af4dc995b8254aa77aae/python/pyspark/errors_doc_gen.py#L44
Yeah, pandas 2.2.0 fixes many bugs, which brings a couple of behavior changes 😢 Let me fix them. Thanks for the confirmation!
I believe this PR now addresses all of the pandas 2.2.0 behavior changes. cc @HyukjinKwon @dongjoon-hyun FYI