pandas icon indicating copy to clipboard operation
pandas copied to clipboard

DOC/TST: Clarify Series.str.get supports passing hashable label

Open xr-chen opened this issue 3 years ago • 12 comments

  • [ ] closes #47911 (Replace xxxx with the Github issue number)
  • [x] Tests added and passed if fixing a bug or adding a new feature
  • [x] All code checks passed.
  • [ ] Added type annotations to new arguments/methods/functions.
  • [x] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

xr-chen avatar Aug 01 '22 20:08 xr-chen

NO! https://github.com/pandas-dev/pandas/issues/47911#issuecomment-1201676711

BeRT2me avatar Aug 01 '22 20:08 BeRT2me

i is only positional in the documentation... it's definitely not strictly positional in current functionality. https://github.com/pandas-dev/pandas/blob/14de3fd9ca4178bfce5dd681fa5d0925e057c04d/pandas/core/strings/object_array.py#L249:L257

BeRT2me avatar Aug 01 '22 21:08 BeRT2me

I am starting to agree with @BeRT2me, I don't think we can change this outright.

I think the fact that str.get works for object with a dict element series was just poorly described as there were past issues explaining some usage of this functionality. e.g. https://github.com/pandas-dev/pandas/issues/20671

I would be +1 to keep supporting this (although there's cognizant dissonance with str.get working on dict elements). I think clarifying pos can be a hashable dict label and adding unit tests would be good instead.

mroeschke avatar Aug 01 '22 21:08 mroeschke

Do you want to deprecate or add this to the documentation?

phofl avatar Aug 01 '22 21:08 phofl

How Series.str works for anything (list, tuple, dict, set, etc.) that isn't a string hasn't made much sense logically for a while.

In my opinion, mirroring how dict.get does things by allowing a default value to be specified if the key/index isn't found would be a positive change.

Not forgetting to update the documentation to match would be nice as well...

BeRT2me avatar Aug 01 '22 21:08 BeRT2me

I think if we have a list of dictionaries like

lst_of_dict = [
   {"name": "Hello", "value": "World"},
   {"name": "Goodbye", "value": "Planet"}
]

we could construct a DataFrame by pd.DataFrame(lst_of_dict), and DataFrame supports pass default value to DataFrame.get. I have no idea about whether we should deprecate pass non-integer type in Series.str.get or not.

xr-chen avatar Aug 01 '22 21:08 xr-chen

Should _str_get supports default value argument as @BeRT2me suggested?

xr-chen avatar Aug 02 '22 20:08 xr-chen

Should _str_get supports default value argument as @BeRT2me suggested?

I would save that for another issue to discuss

mroeschke avatar Aug 08 '22 18:08 mroeschke

Appears some doc checks are failing

mroeschke avatar Aug 08 '22 18:08 mroeschke

Honestly, I don't know what's going wrong from the Traceback of the failed check.

xr-chen avatar Aug 08 '22 19:08 xr-chen

Might want to merge in the main branch into this PR. I believe there was some doc failures that were fixed recently.

mroeschke avatar Aug 08 '22 20:08 mroeschke

Hello @xr-chen! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2022-08-09 01:03:20 UTC

pep8speaks avatar Aug 08 '22 20:08 pep8speaks

Thanks @xr-chen

mroeschke avatar Aug 15 '22 17:08 mroeschke