
`category_orders` does not raise an error with incorrect column name

Open Rabeez opened this issue 5 years ago • 3 comments

I ran into this quite accidentally and am not sure whether this behaviour is mentioned in the documentation or not.

This code produces the expected output:

import plotly.express as px

# iris is a DataFrame with 'sepalLength' and 'species' columns
px.histogram(data_frame=iris, x='sepalLength', facet_col='species',
             category_orders={'species': ['versicolor','virginica','setosa']})

[Image: newplot (2)]

Using an incorrect column name as the key for category_orders, however, silently creates a plot identical to the one produced when no ordering is specified:

px.histogram(data_frame=iris, x='sepalLength', facet_col='species',
             category_orders={'foo': ['versicolor','virginica','setosa']})

[Image: newplot (1)]

In my opinion, this should raise a ValueError, similar to when an incorrect column is specified for the usual arguments (x, y, color, etc.).

Plotly 4.2.1, Python 3.7.4

EDIT: I just checked this with color instead of facet_col, and the same issue is present.

Rabeez, Nov 15 '19 10:11

This is actually on purpose, to make it easier to iterate quickly. For example, say you correctly specify the order as virginica/setosa/versicolor and then re-execute the cell with a filter on df such that no versicolors come out. It would be annoying to have to go back and comment out the order, and then add it back in on another pass when you change the filter again so that setosa is out but versicolor is back in.

As a side-effect, it doesn't warn you or fail on typos, but I think that's a decent tradeoff.

nicolaskruchten, Nov 15 '19 20:11
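
To make the tolerant behaviour described above concrete, here is a minimal sketch in plain Python. It is only a rough model of the idea (requested values that are missing from the data are skipped, and unlisted values go last), not plotly's actual implementation; the function name `effective_order` is hypothetical.

```python
def effective_order(present_values, requested_order):
    """Rough model of a tolerant category ordering (not plotly's actual code)."""
    # Unique values in the data, in first-seen order.
    present = list(dict.fromkeys(present_values))
    # Keep only the requested values that actually occur in the data.
    ordered = [v for v in requested_order if v in present]
    # Values in the data that were not listed in the requested order go last.
    ordered += [v for v in present if v not in ordered]
    return ordered


order = ["virginica", "setosa", "versicolor"]

# A filter has removed every versicolor row; the same order can be reused as-is.
print(effective_order(["setosa", "virginica", "setosa"], order))

# A different filter removes setosa instead; still no edit to the order needed.
print(effective_order(["versicolor", "virginica"], order))
```

Under this model the same `order` list works for every filtered version of the data, which is the convenience being argued for.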

Ah, sorry, I responded too quickly! You're saying that the key is not in the df. I think that's OK too, personally: you can set a bunch of category orders all at once in a dict and just re-use the dict across many figures. That argument is less strong, admittedly.

nicolaskruchten, Nov 15 '19 21:11

@nicolaskruchten if I were making multiple figures where sharing the category order dict was an option, I would probably be using the same dataframe too, right? So having an error (or at least a stern warning 😂) about an incorrect column name would be more useful than having to figure out why the plot doesn't look right. Because, let's be honest, typos in column names are very common.

Rabeez, Nov 15 '19 21:11
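
Until something like this exists in the library, the check discussed in this thread can be approximated on the user side. The sketch below is a hypothetical helper, not part of plotly: the name `check_category_orders` and its `strict` flag are assumptions made for illustration. It takes the column names as a plain iterable (for example `iris.columns`), warns about unknown keys by default, and raises a ValueError when `strict=True`.

```python
import warnings


def check_category_orders(columns, category_orders, strict=False):
    """Report category_orders keys that are not column names.

    Hypothetical user-side helper; plotly itself does not provide it.
    Returns the list of unknown keys (empty if every key is a column).
    """
    known = set(columns)
    unknown = [key for key in category_orders if key not in known]
    if unknown:
        message = f"category_orders keys not found in the columns: {unknown}"
        if strict:
            # The behaviour requested in the original post.
            raise ValueError(message)
        # The "stern warning" alternative: shared dicts keep working.
        warnings.warn(message, UserWarning, stacklevel=2)
    return unknown


# Stand-in for iris.columns, using the column names from the examples above.
columns = ["sepalLength", "species"]
orders = {"foo": ["versicolor", "virginica", "setosa"]}

try:
    check_category_orders(columns, orders, strict=True)
except ValueError as err:
    print(err)
```

Calling the helper right before `px.histogram(...)` with the same dict would catch a typo such as 'foo', while the default non-strict mode still allows one dict to be shared across figures.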