pandas_multiindex_tutorial
DataFrame index does not have to be unique
Hi, in your tutorial you write "The index of a DataFrame is a set (i.e. each element is only represented once) that consists of a label for each row". IMHO this is not true: two rows may have the same label. Example:

    import pandas
    data = [{'A': 'x', 'B': 'y', 'C': 'z'}, {'A': 'x', 'B': 'u', 'C': 'v'}]
    df = pandas.DataFrame(data)
    df.set_index(["A"])
You're right. As it stands now, a more accurate phrasing would be that the index of a DataFrame should be a set. While duplicating a value in the index is possible, it's both very slow and not well supported by the suite of index-based pandas functionality (read: expect errors).
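
To make the point concrete, here is a minimal sketch built on the example from the question. It assumes a reasonably recent pandas release; the variable names (`has_unique_index`, `rows_for_x`, `reindex_failed`) are illustrative only. It shows that pandas accepts the duplicate label, that `Index.is_unique` reports it, and that at least one index-based operation (`reindex`) refuses to work on such an index.

```python
import pandas as pd

# Same data as in the question: both rows share the label 'x' in column A.
data = [{'A': 'x', 'B': 'y', 'C': 'z'}, {'A': 'x', 'B': 'u', 'C': 'v'}]
df = pd.DataFrame(data).set_index("A")

# pandas builds the index without complaint, but reports it as non-unique.
has_unique_index = df.index.is_unique

# Label lookup on a duplicated label returns every matching row
# (a DataFrame here), not a single row.
rows_for_x = df.loc["x"]

# Some index-based operations reject duplicate labels outright.
try:
    df.reindex(["x", "q"])
    reindex_failed = False
except ValueError:
    reindex_failed = True

print(has_unique_index, len(rows_for_x), reindex_failed)
```

If uniqueness matters, `set_index` also accepts `verify_integrity=True`, which raises an error at construction time instead of letting duplicates through.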