Supporting allows_duplicate_labels for Series and DataFrame

Open itholic opened this issue 4 years ago • 0 comments

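Since pandas 1.2, pandas has experimentally supported an allows_duplicate_labels flag on Series and DataFrame that controls whether the index or columns can contain duplicate labels. Duplicates are allowed by default; setting the flag to False via set_flags makes them raise an error.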

In [1]: pd.Series([1, 2], index=['a', 'a'])
Out[1]:
a    1
a    2
Length: 2, dtype: int64

In [2]: pd.Series([1, 2], index=['a', 'a']).set_flags(allows_duplicate_labels=False)
...
DuplicateLabelError: Index has duplicates.
      positions
label
a        [0, 1]
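
For reference, here is a minimal sketch of how the flag can be inspected and set in pandas itself, assuming pandas 1.2 or later is installed. The variable names are only illustrative, and this shows the pandas side only, not any existing Koalas API.

```python
import pandas as pd

# A Series with unique labels; the flag defaults to True.
s = pd.Series([1, 2], index=["a", "b"])
default_flag = s.flags.allows_duplicate_labels

# set_flags returns a new object and leaves the original untouched.
strict = s.set_flags(allows_duplicate_labels=False)
strict_flag = strict.flags.allows_duplicate_labels

# Disallowing duplicates on an object that already has them raises.
raised = False
try:
    pd.Series([1, 2], index=["a", "a"]).set_flags(allows_duplicate_labels=False)
except pd.errors.DuplicateLabelError:
    raised = True

print(default_flag, strict_flag, raised)
```

A Koalas equivalent would presumably need to expose the same `flags` / `set_flags` surface and run the duplicate check on the distributed index, which is the part that needs design.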

The pandas documentation notes:

This is an experimental feature. Currently, many methods fail to propagate the allows_duplicate_labels value. In future versions it is expected that every method taking or returning one or more DataFrame or Series objects will propagate allows_duplicate_labels.
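
To make "propagate" concrete, the sketch below follows the example given in the pandas user guide on duplicate labels, assuming pandas 1.2 or later. Which methods propagate the flag may differ between pandas versions, so treat this as an illustration rather than a guarantee.

```python
import pandas as pd

s1 = pd.Series(0, index=["a", "b"]).set_flags(allows_duplicate_labels=False)

# head() returns a new Series that carries the flag along.
head_flag = s1.head().flags.allows_duplicate_labels

# Because the flag propagated, a rename that would create
# a duplicate label ("b" twice) is rejected.
rename_raised = False
try:
    s1.head().rename({"a": "b"})
except pd.errors.DuplicateLabelError:
    rename_raised = True

print(head_flag, rename_raised)
```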

So I think Koalas should also prepare to support this feature.

itholic, Jan 05 '21 07:01