
DOC: Add docstrings for MultiIndex.levels and MultiIndex.codes

Open datapythonista opened this issue 2 years ago • 20 comments

xref #55148

Seems like those docstrings are empty, we should create them.

See the attributes section here: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.html

The docstring for `MultiIndex.levels` should make clear that levels are preserved even if the DataFrame using the index doesn't contain all of them. See this page in the docs: https://pandas.pydata.org/docs/user_guide/advanced.html#defined-levels and this comment: https://github.com/pandas-dev/pandas/pull/55433#pullrequestreview-1663040010
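
A minimal sketch of the behavior the docstring should describe, assuming a small two-level index (the data and variable names here are illustrative and not taken from the pandas docs):

```python
import pandas as pd

# Hypothetical data: names and values are illustrative only.
index = pd.MultiIndex.from_product(
    [["A", "B"], [1, 2]], names=["first", "second"]
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=index)

# Keep only the first two rows; no remaining row uses the "B" label.
subset = df.iloc[:2]

# The level still lists both labels, even though only "A" is used.
full_levels = list(subset.index.levels[0])

# get_level_values reflects only the labels actually present in the rows.
used_labels = list(subset.index.get_level_values(0).unique())

# remove_unused_levels returns a new MultiIndex with the levels pruned.
pruned_levels = list(subset.index.remove_unused_levels().levels[0])

print(full_levels, used_labels, pruned_levels)
```

Here `levels` describes the labels the index was defined with, while `get_level_values` and `remove_unused_levels` show what the data currently uses.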

datapythonista avatar Oct 07 '23 11:10 datapythonista

This is only the second time I've opened a PR in an open source project, so I misunderstood what you meant. I'll work on the issue again.

shiersansi avatar Oct 07 '23 11:10 shiersansi

That's normal. The original issue was difficult to follow since it was a discussion, but I think the new issue explains better what needs to be done. If you have any questions or need help, we are here to help. Thank you!

datapythonista avatar Oct 07 '23 11:10 datapythonista

We now have the docstring for MultiIndex.levels, but the one for MultiIndex.codes is still missing. Labelling this as a good first issue in case anyone wants to help.

datapythonista avatar Oct 22 '23 09:10 datapythonista

take

mileslow avatar Oct 23 '23 01:10 mileslow

Hi mileslow do you still need time for this task, or do you mind if I work on it?

@Rollingterminator1 go for it.

mileslow avatar Oct 29 '23 16:10 mileslow

Hi, is this issue already taken care of?

devanshi-code18 avatar Nov 09 '23 22:11 devanshi-code18

Hi, does this issue still need to be worked on?

sathyaanurag avatar Nov 14 '23 17:11 sathyaanurag

1. Ensure Correct Data Types:

Make sure that your categorical columns are indeed of the "category" type. You can convert a column to a categorical type using astype:

df['categorical_column'] = df['categorical_column'].astype('category')

2. Check for Null Values:

Ensure that there are no null values in the categorical columns, as this can sometimes affect grouping.

df['categorical_column'].isnull().sum()

If there are null values, you might need to handle them appropriately before performing group operations.

3. Understand Grouping Requirements:

Make sure you understand the requirements of your grouping operation. For example, if you are trying to group by intervals, ensure that your categorical column is defined with the appropriate intervals.

pd.cut(df['numeric_column'], bins=[0, 10, 20, 30])

4. Use Groupby Correctly:

When using groupby, ensure you are providing the correct column name or a list of column names. For example:

grouped_data = df.groupby('categorical_column')['numeric_column'].sum()

Or, for multiple grouping columns:

grouped_data = df.groupby(['categorical_column1', 'categorical_column2'])['numeric_column'].sum()

5. Check Pandas Version:

Ensure that you are using a recent version of pandas. Bugs are often fixed in newer releases. You can check your pandas version with:

import pandas as pd
print(pd.__version__)

If you're using an older version, consider upgrading:

pip install --upgrade pandas

6. Minimal, Complete, and Verifiable Example:

If the issue persists, try to create a minimal, complete, and verifiable example that reproduces the problem. This makes it easier for others to help diagnose and fix the issue.

If you can provide more details or a sample of your code and data, I might be able to give more specific advice. Additionally, checking the pandas documentation or community forums can sometimes provide insights into common issues or bug reports.
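
The steps in this comment can be sketched end to end as follows. This is only an illustration under assumed data: the column names mirror the placeholders used above, and the values are made up.

```python
import pandas as pd

# Hypothetical data: one missing category, to exercise the null check.
df = pd.DataFrame(
    {
        "categorical_column": ["a", "b", "a", None],
        "numeric_column": [5, 15, 25, 8],
    }
)

# Step 1: convert to the category dtype.
df["categorical_column"] = df["categorical_column"].astype("category")

# Step 2: count missing values in the grouping column.
null_count = int(df["categorical_column"].isnull().sum())

# Step 3: bin a numeric column into intervals with pd.cut.
df["interval"] = pd.cut(df["numeric_column"], bins=[0, 10, 20, 30])

# Step 4: group by the categorical column; rows with a missing key
# are dropped by default, and observed=True keeps only used categories.
totals = df.groupby("categorical_column", observed=True)["numeric_column"].sum()

# Grouping by the interval column sums the values that fall in each bin.
by_interval = df.groupby("interval", observed=True)["numeric_column"].sum()

print(null_count)
print(totals)
print(by_interval)
```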

wasimtikki120 avatar Nov 15 '23 16:11 wasimtikki120

class MultiIndex:
    """
    A multi-level, or hierarchical, index object for pandas DataFrame.

    ...

    Attributes
    ----------
    levels : list
        List of Index objects containing the unique values for each level of the MultiIndex.
    codes : list
        List of integer arrays giving, for each level, the position in that level of each element.

    ...

    Examples
    --------
    >>> arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
    >>> tuples = list(zip(*arrays))
    >>> index = pd.MultiIndex.from_tuples(tuples, names=('first', 'second'))
    >>> index
    MultiIndex([('A', 1),
                ('A', 2),
                ('B', 1),
                ('B', 2)],
               names=['first', 'second'])

    >>> index.levels
    FrozenList([['A', 'B'], [1, 2]])

    >>> index.codes
    FrozenList([[0, 0, 1, 1], [0, 1, 0, 1]])
    """

    def __init__(self, levels, codes):
        """
        Parameters
        ----------
        levels : list
            List of Index objects containing the unique values for each level of the MultiIndex.
        codes : list
            List of integer arrays giving, for each level, the position in that level of each element.
        """
        self.levels = levels
        self.codes = codes
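
The relationship between `levels` and `codes` described in the docstring can be checked with a short sketch. It assumes the same four-tuple index as the example, plus a second, made-up index containing a missing value:

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)], names=["first", "second"]
)

# Each code is a position into the matching level, so indexing a level
# with its codes rebuilds that level's values row by row.
rebuilt_ok = [
    level.take(codes).equals(index.get_level_values(i))
    for i, (level, codes) in enumerate(zip(index.levels, index.codes))
]

# Hypothetical second index: a missing label is stored as code -1
# and does not appear in the level itself.
with_nan = pd.MultiIndex.from_tuples([("A", 1), (np.nan, 2)])
nan_codes = list(with_nan.codes[0])
nan_level = list(with_nan.levels[0])

print(rebuilt_ok, nan_codes, nan_level)
```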

wasimtikki120 avatar Nov 15 '23 17:11 wasimtikki120

take

Arpan3323 avatar Nov 23 '23 09:11 Arpan3323

take

chethanc1011 avatar Dec 07 '23 13:12 chethanc1011

Hi, I would like to contribute.

dwk601 avatar Dec 15 '23 05:12 dwk601

Hi, looks like this has been inactive for a while so I'd like to try it

sjalkote avatar Apr 10 '24 22:04 sjalkote

take

sjalkote avatar Apr 10 '24 22:04 sjalkote

Ah it looks like there is already a docstring for MultiIndex.codes present in the main branch. Seems like this has already been fixed. https://github.com/pandas-dev/pandas/blob/b1525c4a3788d161653b04a71a84e44847bedc1b/pandas/core/indexes/multi.py#L1080-L1102

sjalkote avatar Apr 10 '24 22:04 sjalkote

take

sam-baumann avatar Apr 24 '24 17:04 sam-baumann

Looks like #57601 fixed this - can we close this?

sam-baumann avatar Apr 24 '24 17:04 sam-baumann

@datapythonista can we close this? Looks like it was solved by #57601.

sam-baumann avatar Apr 26 '24 20:04 sam-baumann

Is the issue still open?

GAuravY19 avatar May 11 '24 09:05 GAuravY19

Is the issue still open?

The docstrings have been added, but there are many more issues labeled 'Docs' that we would appreciate your help with.

Aloqeely avatar May 24 '24 15:05 Aloqeely