pandas
BUG: Fix return type of loc/iloc
- [X] closes #60600
- [X] Tests added and passed if fixing a bug or adding a new feature
- [X] All code checks passed.
- [ ] Added type annotations to new arguments/methods/functions.
- [X] Added an entry in the latest doc/source/whatsnew/v3.0.0.rst file if fixing a bug or adding a new feature.
Description of Linked Issue
loc/iloc return inconsistent dtypes for equivalent selections. For example,
>>> import pandas as pd
>>> df = pd.DataFrame([['a', 1., 2.], ['b', 3., 4.]])
>>> df.loc[0, [1, 2]]
1 1.0
2 2.0
Name: 0, dtype: object
>>> df[[1, 2]].loc[0]
1 1.0
2 2.0
Name: 0, dtype: float64
>>> df.loc[[0, 1], 1]
0 1.0
1 3.0
Name: 1, dtype: float64
This behaviour seems to arise from the following sequence:
- For axis=0, BlockManager.fast_xs() returns a cross-section of df and determines the dtype as object, since df.loc[0, :] is supposed to include 'a'.
- For axis=1, NDFrame._reindex_with_indexers() returns the result without re-inferring its dtype.
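The two steps above can be reproduced through the public API only. This is a minimal sketch: it does not call BlockManager.fast_xs() or NDFrame._reindex_with_indexers() directly, and the variable names are illustrative. It shows that a row taken from a mixed-dtype frame is object, that narrowing it afterwards keeps object, and that selecting the float columns first avoids the upcast.

```python
import pandas as pd

df = pd.DataFrame([["a", 1.0, 2.0], ["b", 3.0, 4.0]])

# A full row mixes a string with floats, so the cross-section is object.
row = df.loc[0]
row_dtype = row.dtype

# Narrowing that row to the float columns afterwards does not re-infer the dtype.
narrowed_dtype = row.loc[[1, 2]].dtype

# Selecting the float columns first means the row is cut from float data only.
cols_first_dtype = df[[1, 2]].loc[0].dtype

print(row_dtype, narrowed_dtype, cols_first_dtype)
```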
Proposed Solution
Based on the above examples, we can conclude that this issue only appears when the axis-0 key is an int and the axis-1 key is a list or slice, i.e. loc/iloc[int, list/slice].
Therefore, I'd like to propose adding the code below to re-infer the dtype after the column selection.
@final
def _getitem_lowerdim(self, tup: tuple):
    ...
    # This is an elided recursive call to iloc/loc
    out = getattr(section, self.name)[new_key]
    # Re-interpret dtype of out.values for loc/iloc[int, list/slice].
    # GH60600
    if (
        i == 0
        and isinstance(key, int)
        and isinstance(new_key, (list, slice))
    ):
        out = out.infer_objects()
    return out
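To illustrate the effect of the added branch in isolation, here is a rough sketch. The helper reinfer_if_row_then_columns is hypothetical and not part of pandas; it only mirrors the condition above on a hand-built object Series, assuming out behaves like the result of the elided recursive call.

```python
import pandas as pd


def reinfer_if_row_then_columns(out, i, key, new_key):
    """Hypothetical stand-alone mirror of the proposed condition."""
    if i == 0 and isinstance(key, int) and isinstance(new_key, (list, slice)):
        # Soft conversion: object data holding only floats becomes float64.
        return out.infer_objects()
    return out


# Stand-in for the object-dtype result of df.loc[0, [1, 2]].
obj_floats = pd.Series([1.0, 2.0], index=[1, 2], name=0, dtype=object)

# Condition met (i == 0, int key, list new_key): dtype is re-inferred.
fixed = reinfer_if_row_then_columns(obj_floats, 0, 0, [1, 2])

# Condition not met (i != 0): the result is returned unchanged.
untouched = reinfer_if_row_then_columns(obj_floats, 1, 0, [1, 2])

# Genuinely mixed data stays object, so infer_objects() is safe here.
mixed = pd.Series(["a", 1.0], dtype=object).infer_objects()

print(fixed.dtype, untouched.dtype, mixed.dtype)
```

Since infer_objects() only converts when every value fits a more specific dtype, rows that really are mixed keep dtype object.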
Thanks!