pandas_multiindex_tutorial

DataFrame index does not have to be unique

Open • feralfer opened this issue 3 years ago • 1 comment

Hi, in your tutorial you write "The index of a DataFrame is a set (i.e. each element is only represented once) that consists of a label for each row". IMHO this is not true: two rows may have the same label. For example:

    import pandas

    # Both rows have 'x' in column A, so the resulting index repeats the label 'x'.
    data = [{'A': 'x', 'B': 'y', 'C': 'z'}, {'A': 'x', 'B': 'u', 'C': 'v'}]
    df = pandas.DataFrame(data)
    df.set_index(["A"])

feralfer, Jun 21 '22 15:06

You're right. As it stands now, a more accurate phrasing would be that the index of a DataFrame should be a set. Duplicating a value in the index is possible, but it's both very slow and not well supported by the suite of index-based pandas functionality (read: expect errors).

ZaxR, Jun 21 '22 15:06
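
To make the point in this thread concrete, here is a minimal sketch that reuses the data from the issue. It is only an illustration, not part of the tutorial: the variable names are made up for this example, and `reindex` is just one index-based operation that rejects duplicate labels (others may behave differently). The last part shows `verify_integrity=True`, an existing `set_index` option, as one possible way to enforce the "should be a set" rule.

```python
import pandas as pd

data = [
    {"A": "x", "B": "y", "C": "z"},
    {"A": "x", "B": "u", "C": "v"},
]
df = pd.DataFrame(data).set_index("A")

# pandas accepts the repeated label "x" in the index without complaint.
has_unique_index = df.index.is_unique
duplicate_mask = df.index.duplicated()  # marks repeats after the first occurrence

# A label lookup returns every matching row, so the result is a DataFrame
# with two rows rather than a single Series.
rows_for_x = df.loc["x"]

# Some index-based operations refuse duplicate labels; reindex is one of them.
try:
    df.reindex(["x", "w"])
    reindex_error = None
except ValueError as exc:
    reindex_error = exc

# Asking set_index to check for duplicates turns the problem into an early error.
try:
    pd.DataFrame(data).set_index("A", verify_integrity=True)
    integrity_error = None
except ValueError as exc:
    integrity_error = exc
```

Checking `df.index.is_unique` right after building a DataFrame is a cheap way to catch this before later index-based code fails in a less obvious place.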