pandas DOC: fix docstring validation errors for pandas.Series

Follow-up on issues #56804, #59458 and #58063. pandas has a script for validating docstrings:

https://github.com/pandas-dev/pandas/blob/0cdc6a48302ba1592b8825868de403ff9b0ea2a5/ci/code_checks.sh#L155-L187

Currently, some methods fail the docstring validation check. The task here is to:

take 2-4 methods
run: scripts/validate_docstrings.py <method-name>
fix the docstrings according to whatever error is reported
remove those methods from code_checks.sh script
commit, push, open pull request
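
As a rough illustration of the fourth step, the sketch below drops the ignore entries for fixed methods from the script text. It assumes the entries look like the `-i "pandas.Series.prod RT03" \` lines quoted in this thread, one per line; the helper name `drop_ignore_entries` is hypothetical and not part of pandas.

```python
import re


def drop_ignore_entries(script_text: str, methods: set[str]) -> str:
    """Return script_text without the `-i "<method> <codes>"` lines for methods.

    Hypothetical helper, not part of pandas. It assumes one ignore entry per
    line, formatted like the entries quoted in this issue.
    """
    # Matches e.g.:  -i "pandas.Series.prod RT03" \
    entry = re.compile(r'^\s*-i\s+"(?P<name>\S+)\s+[A-Z0-9,]+"\s*\\?\s*$')
    kept = []
    for line in script_text.splitlines():
        match = entry.match(line)
        if match and match.group("name") in methods:
            continue  # this method is fixed, so stop ignoring its errors
        kept.append(line)
    return "\n".join(kept)


sample = '''\
        -i "pandas.Series.pop SA01" \\
        -i "pandas.Series.prod RT03" \\
        -i "pandas.Series.product RT03" \\
'''
cleaned = drop_ignore_entries(sample, {"pandas.Series.prod"})
print(cleaned)
```

Names are compared exactly, so removing `pandas.Series.prod` leaves the separate `pandas.Series.product` entry in place.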

Example:

scripts/validate_docstrings.py pandas.Series.prod

pandas.Series.prod fails with the ES01 and RT03 errors:

################################################################################
################################## Validation ##################################
################################################################################

2 Errors found for `pandas.Series.prod`:
        ES01    No extended summary found
        RT03    Return value has no description
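
For illustration, here is a minimal sketch of a numpydoc-style docstring that has both an extended summary (the ES01 check) and a described return value (the RT03 check). The function `product_of` is hypothetical and only loosely modelled on a product reduction; it is not the pandas implementation or its actual docstring.

```python
import math
from collections.abc import Iterable


def product_of(values: Iterable[float]) -> float:
    """
    Return the product of the values.

    Multiply all elements together and return a single scalar. An empty
    input yields 1.0, the multiplicative identity.

    Parameters
    ----------
    values : iterable of float
        The numbers to multiply.

    Returns
    -------
    float
        The product of all elements in ``values``.

    See Also
    --------
    math.prod : Standard-library product of an iterable.

    Examples
    --------
    >>> product_of([1.0, 2.0, 3.0])
    6.0
    """
    return float(math.prod(values))
```

The paragraph after the one-line summary is the extended summary, and the indented line under `float` in the Returns section is the return description.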

Please don't comment "take", as multiple people can work on this issue. You also don't need to ask for permission to work on this; just comment with the methods you are going to work on.

If you're a new contributor, please check the contributing guide.

Aug 24 '24 09:08 natmokval

I'll take these:

 -i "pandas.Series.sparse.fill_value SA01" \ 
 -i "pandas.Series.sparse.from_coo PR07,SA01" \ 
 -i "pandas.Series.sparse.npoints SA01" \ 
 -i "pandas.Series.sparse.sp_values SA01" \ 
 -i "pandas.Series.sparse.to_coo PR07,RT03,SA01" \

Aug 24 '24 12:08 ivonastojanovic

I'll take these:

 -i "pandas.Series.str.wrap RT03,SA01" \ 
 -i "pandas.Series.str.zfill RT03" \

Aug 24 '24 17:08 wenchen-cai

Working on these:

 -i "pandas.Series.str.match RT03" \ 
 -i "pandas.Series.str.normalize RT03,SA01" \ 
 -i "pandas.Series.str.repeat SA01" \ 
 -i "pandas.Series.str.replace SA01" \

Aug 24 '24 17:08 ivonastojanovic

I'll take these:

-i "pandas.Series.struct.dtypes SA01" \ 
-i "pandas.Series.to_markdown SA01" \

Aug 25 '24 03:08 githubalexliu

Here's a filtered list of pandas.Series docstring issues that still need to be addressed:

        ...
        -i "pandas.Series.dt.as_unit PR01,PR02" \
        ...
        -i "pandas.Series.dt.round PR01,PR02" \
        ...
        -i "pandas.Series.dt.unit GL08" \
        ...
        -i "pandas.Series.pad PR01,SA01" \
        ...

I went ahead and removed methods that were already claimed/addressed by open + merged PRs. (Last updated 9/2/2024)
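
A filter like this could be sketched in Python as follows. This is only an assumption about how such a list might be produced, not the method actually used here; the helper `unclaimed_entries` is hypothetical, and the sample entries are copied from this thread.

```python
import re


def unclaimed_entries(script_text: str, claimed: set[str]) -> dict[str, list[str]]:
    """Map each unclaimed pandas.Series entry to its list of error codes.

    Hypothetical helper. It assumes ignore entries of the form
    -i "<method> <CODE>,<CODE>" as quoted in this issue.
    """
    entry = re.compile(r'-i\s+"(\S+)\s+([A-Z0-9,]+)"')
    remaining = {}
    for name, codes in entry.findall(script_text):
        if name.startswith("pandas.Series.") and name not in claimed:
            remaining[name] = codes.split(",")
    return remaining


listing = '''\
        -i "pandas.Series.dt.as_unit PR01,PR02" \\
        -i "pandas.Series.dt.unit GL08" \\
        -i "pandas.Series.pad PR01,SA01" \\
'''
claimed = {"pandas.Series.dt.unit"}
remaining = unclaimed_entries(listing, claimed)
for name, codes in remaining.items():
    print(name, codes)
```

The set of claimed methods still has to be collected by hand from the comments and open pull requests.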

Aug 25 '24 11:08 hlakams

I'll take these:

 -i "pandas.Series.pop SA01" \
 -i "pandas.Series.list.__getitem__ SA01" \
 -i "pandas.Series.list.flatten SA01" \
 -i "pandas.Series.list.len SA01" \
 -i "pandas.Series.reorder_levels RT03,SA01" \
 -i "pandas.Series.sparse.density SA01" \
 -i "pandas.Series.gt SA01" \
 -i "pandas.Series.lt SA01" \
 -i "pandas.Series.ne SA01" \
 -i "pandas.Series.prod RT03" \
 -i "pandas.Series.product RT03" \

Aug 25 '24 12:08 hlakams

I will take

-i "pandas.Series.dt.strftime PR01,PR02" \
-i "pandas.Series.dt.to_period PR01,PR02" \
-i "pandas.Series.dt.total_seconds PR01" \
-i "pandas.Series.dt.tz_convert PR01,PR02" \
-i "pandas.Series.dt.tz_localize PR01,PR02" \
-i "pandas.Series.dt.unit GL08" \

Aug 26 '24 21:08 Pranav-Wadhwa

I'll take

 -i "pandas.Series.std PR01,RT03,SA01" \ 
 -i "pandas.Series.sem PR01,RT03,SA01" \

Aug 27 '24 02:08 james-magee

I followed the instructions and encountered this issue: I added 'See Also' to the function fill_value(self) in ./pandas/core/arrays/sparse/array.py. After running the command python3 scripts/validate_docstrings.py pandas.Series.sparse.fill_value, I received the message:

thang123456@MSI:/mnt/c/Users/ADMIN/Desktop/pandas/pandas$ python3 scripts/validate_docstrings.py pandas.Series.sparse.fill_value

################################################################################
################# Docstring (pandas.Series.sparse.fill_value) #################
################################################################################

Elements in data that are fill_value are not stored.

For memory savings, this should be the most common value in the array.

Examples
--------
>>> ser = pd.Series([0, 0, 2, 2, 2], dtype="Sparse[int]")
>>> ser.sparse.fill_value
0
>>> spa_dtype = pd.SparseDtype(dtype=np.int32, fill_value=2)
>>> ser = pd.Series([0, 0, 2, 2, 2], dtype=spa_dtype)
>>> ser.sparse.fill_value
2

################################################################################
################################## Validation ##################################
################################################################################

1 Errors found for `pandas.Series.sparse.fill_value`:
        SA01    See Also section not found

I checked very carefully but still couldn't fix the error. Can someone help me understand what is going wrong?

Aug 27 '24 08:08 Tmthang1601

I will take:

-i "pandas.Series.dt.floor PR01,PR02" \
-i "pandas.Series.dt.ceil PR01,PR02" \

Aug 27 '24 18:08 Gesare5

I'll take these:

-i "pandas.Series.sparse PR01,SA01" \
-i "pandas.Series.sparse.to_coo PR07,RT03,SA01" \

Aug 27 '24 21:08 pol-rius

I'll take these:

-i "pandas.Series.dt.normalize PR01" \
-i "pandas.Series.dt.qyear GL08" \

Aug 27 '24 22:08 githubalexliu

@Tmthang1601 The pandas prefix is not needed for SparseDtype and SparseArray. Remove that prefix and the validation command should pass.

See Also
--------
SparseDtype : Dtype for sparse array.
SparseArray : Array of sparse data.

Aug 28 '24 00:08 hlakams

@Tmthang1601 The pandas prefix is not needed for SparseDtype and SparseArray. Remove that prefix and the validation command should pass.
See Also
--------
SparseDtype : Dtype for sparse array.
SparseArray : Array of sparse data.

@hlakams
Originally there was no "See Also" section (SparseDtype : Dtype for sparse array. SparseArray : Array of sparse data.) in the docstring of the fill_value function. I added it myself to get rid of the error. I didn't expect the error to go away after I removed the section again, and when I tried, of course it didn't go away.

Aug 28 '24 01:08 Tmthang1601

@Tmthang1601 Can you push up your changes in a new PR?

Aug 28 '24 01:08 hlakams

@hlakams According to the instructions, you need to complete 2 to 4 methods and run the script successfully before pushing to a new PR, but I'm having trouble.

Aug 28 '24 01:08 Tmthang1601

@Tmthang1601 I'm not sure what the issue is, but try replacing lines 620:639 from https://github.com/pandas-dev/pandas/issues/59592#issuecomment-2311939867 with the following docstring:

        """
        Elements in `data` that are `fill_value` are not stored.

        For memory savings, this should be the most common value in the array.

        See Also
        --------
        SparseDtype : Dtype for sparse array.
        SparseArray : Array of sparse data.

        Examples
        --------
        >>> ser = pd.Series([0, 0, 2, 2, 2], dtype="Sparse[int]")
        >>> ser.sparse.fill_value
        0
        >>> spa_dtype = pd.SparseDtype(dtype=np.int32, fill_value=2)
        >>> ser = pd.Series([0, 0, 2, 2, 2], dtype=spa_dtype)
        >>> ser.sparse.fill_value
        2
        """

Run pre-commit once this change from https://github.com/pandas-dev/pandas/issues/59592#issuecomment-2313880479 is committed (assuming it was configured correctly) + address possible lint errors and you should be able to push up to your fork.

Aug 28 '24 02:08 hlakams

I will take these:

        -i "pandas.Series.dt.day_name PR01,PR02" \
        -i "pandas.Series.dt.month_name PR01,PR02" \

Sep 03 '24 03:09 yinglyu

I will take this:

-i "pandas.Series.update PR07,SA01" \

Sep 03 '24 18:09 doshi-kevin

I'll work on this:

-i "pandas.Series.str.swapcase RT03" \

Sep 06 '24 19:09 blackhole-hoop

I'll work on these:

-i "pandas.Series.dt.nanoseconds SA01" \
-i "pandas.Series.dt.seconds SA01"

Sep 06 '24 19:09 chalky25

I'll work on this:

-i "pandas.Series.str.swapcase RT03" \

It seems that pandas.Series.str.swapcase has already been done.

Sep 06 '24 19:09 chalky25

Sorry, I am a first time contributor. May I know how to check whether something is done or not? I searched for the keyword "swapcase" on this page and didn't see anyone was working on this. @chalky25

Sep 06 '24 20:09 blackhole-hoop

Welcome to contributing, @blackhole-hoop. I also started three days ago.

I'm also not sure who fixed it or how it got fixed, because none of the merged commits mention it.

That said, I just tested it using the following command:

scripts/validate_docstrings.py pandas.Series.str.swapcase

And, I got the following in the result.


################################################################################
#################### Docstring (pandas.Series.str.swapcase) ####################
################################################################################

Convert strings in the Series/Index to be swapcased.

Equivalent to :meth:`str.swapcase`.

Returns
-------
Series or Index of objects
    A Series or Index where the strings are modified by :meth:`str.swapcase`.

See Also
--------
Series.str.lower : Converts all characters to lowercase.
Series.str.upper : Converts all characters to uppercase.
Series.str.title : Converts first character of each word to uppercase and
    remaining to lowercase.
Series.str.capitalize : Converts first character to uppercase and
    remaining to lowercase.
Series.str.swapcase : Converts uppercase to lowercase and lowercase to
    uppercase.
Series.str.casefold: Removes all case distinctions in the string.

Examples
--------
>>> s = pd.Series(['lower', 'CAPITALS', 'this is a sentence', 'SwApCaSe'])
>>> s
0                 lower
1              CAPITALS
2    this is a sentence
3              SwApCaSe
dtype: object

>>> s.str.lower()
0                 lower
1              capitals
2    this is a sentence
3              swapcase
dtype: object

>>> s.str.upper()
0                 LOWER
1              CAPITALS
2    THIS IS A SENTENCE
3              SWAPCASE
dtype: object

>>> s.str.title()
0                 Lower
1              Capitals
2    This Is A Sentence
3              Swapcase
dtype: object

>>> s.str.capitalize()
0                 Lower
1              Capitals
2    This is a sentence
3              Swapcase
dtype: object

>>> s.str.swapcase()
0                 LOWER
1              capitals
2    THIS IS A SENTENCE
3              sWaPcAsE
dtype: object

################################################################################
################################## Validation ##################################
################################################################################

Docstring for "pandas.Series.str.swapcase" correct. :)

In other words, you use the script given by the original poster to check the docstring.
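
If you want to check several methods at once, the printed report can be scanned for error codes. The sketch below is a hypothetical helper, not part of pandas; it assumes the report format quoted in this thread (an "N Errors found for ..." line followed by indented code and description lines) and would need adjusting if the script's output differs.

```python
import re


def parse_validation_output(output: str) -> dict[str, str]:
    """Map error codes to descriptions found in the script's printed output.

    Hypothetical helper. It assumes indented "<CODE>    <description>" lines,
    where a code is two capital letters followed by two digits.
    """
    errors = {}
    for line in output.splitlines():
        match = re.match(r"^\s+([A-Z]{2}\d{2})\s+(.+)$", line)
        if match:
            errors[match.group(1)] = match.group(2).strip()
    return errors


failing = """\
1 Errors found for `pandas.Series.str.swapcase`:
        RT03    Return value has no description
"""
passing = 'Docstring for "pandas.Series.str.swapcase" correct. :)\n'

print(parse_validation_output(failing))
print(parse_validation_output(passing))
```

An empty result means no error lines were found, which matches the "correct. :)" report shown above.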

Sep 06 '24 21:09 ammar-qazi

I will work on these:

"pandas.Series.str.lower RT03" \
"pandas.Series.str.center RT03,SA01" \
"pandas.Series.str.title RT03" \
"pandas.Series.str.lstrip RT03" \

Sep 13 '24 04:09 pratik305

I started contributing and found that some methods have already been fixed without being mentioned here. I also ran the script on some methods whose fixes are already merged, and they still show errors, for example: python scripts/validate_docstrings.py pandas.Series.str.swapcase

Result

################################################################################
#################### Docstring (pandas.Series.str.swapcase) ####################
################################################################################

Convert strings in the Series/Index to be swapcased.

Equivalent to :meth:`str.swapcase`.

Returns
-------
Series or Index of object

See Also
--------
Series.str.lower : Converts all characters to lowercase.
Series.str.upper : Converts all characters to uppercase.
Series.str.title : Converts first character of each word to uppercase and
    remaining to lowercase.
Series.str.capitalize : Converts first character to uppercase and
    remaining to lowercase.
Series.str.swapcase : Converts uppercase to lowercase and lowercase to
    uppercase.
Series.str.casefold: Removes all case distinctions in the string.

Examples
--------
>>> s = pd.Series(['lower', 'CAPITALS', 'this is a sentence', 'SwApCaSe'])
>>> s
0                 lower
1              CAPITALS
2    this is a sentence
3              SwApCaSe
dtype: object

>>> s.str.lower()
0                 lower
1              capitals
2    this is a sentence
3              swapcase
dtype: object

>>> s.str.upper()
0                 LOWER
1              CAPITALS
2    THIS IS A SENTENCE
3              SWAPCASE
dtype: object

>>> s.str.title()
0                 Lower
1              Capitals
2    This Is A Sentence
3              Swapcase
dtype: object

>>> s.str.capitalize()
0                 Lower
1              Capitals
2    This is a sentence
3              Swapcase
dtype: object

>>> s.str.swapcase()
0                 LOWER
1              capitals
2    THIS IS A SENTENCE
3              sWaPcAsE
dtype: object

################################################################################
################################## Validation ##################################
################################################################################

1 Errors found for `pandas.Series.str.swapcase`:
        RT03    Return value has no description

Also, the code_checks.sh file no longer contains any pandas.Series.str entries. Are all str-related docstrings fixed?

Sep 13 '24 07:09 pratik305

I'll take these:

 -i "pandas.Series.str.rjust RT03,SA01" \ 
 -i "pandas.Series.str.rpartition RT03" \ 
 -i "pandas.Series.str.rstrip RT03" \

Sep 17 '24 17:09 mysticshirou

I want to work on these issues:

-i "pandas.Series.sparse.sp_values SA01,ES01" \
-i "pandas.Series.str.match ES01" \

Sep 22 '24 12:09 syeda-fajar

Hello! I am new to the pandas community. It seems like most of these are already taken. Is there any way to filter which methods have been run already?

Sep 26 '24 00:09 dhelms33

/Assign

Oct 22 '24 05:10 techie505
