Haejoon Lee

Results 18 issues of Haejoon Lee

### What changes were proposed in this pull request?

Follow-up for https://github.com/apache/spark/pull/37294, to improve test coverage by adding more tests.

### Why are the changes needed?

To improve the test...

CORE
PYTHON
PANDAS API ON SPARK

Currently, Series doesn't support comparison with list-like Python objects such as `list`, `tuple`, `dict`, and `set`.

```python
>>> kser
0    1
1    2
2    3
dtype: int64
>>> kser ==...
```

So far, Koalas doesn't support list-like Python objects for Series binary operations.

```python
>>> kser
0    1
1    2
2    3
3    4
4    5
5    6
Name: x, dtype:...
```

This PR proposes `MultiIndex.equal_levels`.

```python
>>> kmidx1 = ks.MultiIndex.from_tuples([("a", "x"), ("b", "y"), ("c", "z")])
>>> kmidx2 = ks.MultiIndex.from_tuples([("b", "y"), ("a", "x"), ("c", "z")])
>>> kmidx1.equal_levels(kmidx2)
True
```
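For comparison, plain pandas exposes a method with the same name. The sketch below assumes pandas is installed and uses illustrative names (`pmidx1`, `pmidx2`, `pmidx3`) that do not come from the PR; it only shows the pandas behavior that the proposal appears to mirror, not the Koalas implementation.

```python
import pandas as pd

# Same tuples in a different order: the per-level label sets are identical.
pmidx1 = pd.MultiIndex.from_tuples([("a", "x"), ("b", "y"), ("c", "z")])
pmidx2 = pd.MultiIndex.from_tuples([("b", "y"), ("a", "x"), ("c", "z")])

# Hypothetical third index whose first level contains "d" instead of "c".
pmidx3 = pd.MultiIndex.from_tuples([("a", "x"), ("b", "y"), ("d", "z")])

# equal_levels compares the levels of both indexes, not the tuple order.
same_levels = pmidx1.equal_levels(pmidx2)
different_levels = pmidx1.equal_levels(pmidx3)

# equals, by contrast, also takes the order of the entries into account.
same_index = pmidx1.equals(pmidx2)

print(same_levels, different_levels, same_index)
```

The contrast with `equals` is the point of the example: two indexes can share levels without being equal element by element.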

This PR proposes `DataFrame.lookup`.

```python
>>> kdf = ks.DataFrame({'A': [3, 4, 5, 6, 7],
...                     'B': [10.0, 20.0, 30.0, 40.0, 50.0],
...                     'C': ['a', 'b', 'c', 'd', 'e']})
>>> kdf...
```

pandas supports `datetime64` and `datetime64tz` dtypes for `std` as of pandas 1.2 (https://github.com/pandas-dev/pandas/pull/37436), and it returns a Timedelta Series, which Koalas currently cannot support.

```python
>>> pdf = pd.DataFrame(
...     {...
```

enhancement

As of [pandas 1.2](https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.2.0.html#optionally-disallow-duplicate-labels), pandas experimentally supports `allows_duplicate_labels` when creating a `Series` or `DataFrame`, to control whether the index or columns can contain duplicate labels.

```python
In [1]: pd.Series([1,...
```

enhancement

When creating a `Series` that contains only `np.nan`, the values are unexpectedly cast to `None`, as shown below.

```python
>>> ks.Series([np.nan, np.nan])
0    None
1    None
Name: 0, dtype: object
```

but pandas...
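The preview is cut off before the pandas output. As a reference point only, the following minimal sketch checks what plain pandas infers for an all-`np.nan` Series; it assumes pandas and NumPy are installed, and the variable names are illustrative rather than taken from the issue.

```python
import numpy as np
import pandas as pd

# Build the same all-NaN input in plain pandas.
pser = pd.Series([np.nan, np.nan])

# Record the inferred dtype and confirm every value is treated as missing.
dtype_name = str(pser.dtype)
all_missing = bool(pser.isna().all())

print(dtype_name, all_missing)
```

In plain pandas the values stay floating-point NaN rather than becoming `None` objects, which is the discrepancy the issue title points at.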

bug

Assume we have a Series like the one below.

```python
>>> pser = pd.Series([1, 2, 3])
>>> kser = ks.from_pandas(pser)
```

and there is an issue for some cases with arithmetic...

bug

Inspired by #1261, we should add a list of the APIs that are not planned to be implemented to the official docs.