Dodge applied in different order with / without previous Agg

Open z626093820 opened this issue 7 months ago • 4 comments

When I combine a bar layer with a dot (scatter) layer, the dots are not drawn at the correct dodged positions. How can I solve this? Thank you!

import seaborn.objects as so
import seaborn as sns
from seaborn import axes_style,plotting_context
so.Plot.config.theme.update(
    plotting_context('paper',font_scale=1.4)
    | axes_style("ticks",rc={'axes.spines.top':False, 'axes.spines.right':False})
)
penguins=sns.load_dataset('penguins')
(
    so.Plot(penguins, x="island", y='bill_length_mm', color="species")
    .layout(size=(3, 3))
    .add(so.Bar(), so.Agg(), so.Dodge())
    .add(so.Range(), so.Est(errorbar="sd"), so.Dodge())
    .add(so.Dot(pointsize=2), so.Jitter(0.2), so.Dodge())
)

z626093820 · Nov 12 '23 13:11

This looks like a bug, probably with the same underlying issue as https://github.com/mwaskom/seaborn/issues/3015.

You can work around it by sorting your dataframe on the column you're using to group by before plotting.

P.S. I formatted the code in your OP so that it is easier to read.

mwaskom · Nov 13 '23 13:11

You can also pass a Nominal scale with an explicit ordering:

(
    so.Plot(penguins, x="island",y='bill_length_mm', color="species")
    .add(so.Range(), so.Est(errorbar="sd"), so.Dodge())
    .add(so.Dot(pointsize=2), so.Jitter(0.2), so.Dodge())
    .add(so.Bar(), so.Agg(), so.Dodge())
    .scale(color=so.Nominal(order=penguins["species"].unique().tolist()))
)

So the question is ... why isn't the default ordering getting passed to the groupby operations?

mwaskom · Nov 14 '23 00:11

The GPT answer is: This problem may arise between the sorting of the dataset and the groupby operation. By default, groupby does not retain the order of the original data. Ways to solve this may include sorting the data before the groupby, or explicitly specifying the sort behavior in the groupby operation.

z626093820 · Nov 14 '23 07:11

Thanks, yeah, seaborn already does manage order internally (and turns off sorting) when using groupbys, and that's happening in order-sensitive moves like Dodge. However, the default ordering is not computed until after the stat is applied, and here the stat is changing the default order specifically because the crossing with the x variable is "sparse" (e.g., if you change x to "sex" you get the same order with/without Agg).
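To make the "sparse crossing" point concrete, here is a toy model in plain Python. It is not seaborn's code: the helpers `first_seen` and `order_after_agg` are hypothetical, and the assumption that aggregation visits cells in x-major order and drops empty ones is mine, made only to illustrate how an aggregation step can change the order in which groups first appear.

```python
from itertools import product


def first_seen(values):
    """Return the unique values in order of first appearance."""
    return list(dict.fromkeys(values))


def order_after_agg(rows):
    """Order in which color groups first appear after a per-cell aggregation.

    Assumed model (not seaborn's implementation): cells of the full
    x-by-color crossing are visited in x-major order, and cells with no
    observations are dropped.
    """
    x_order = first_seen(x for x, _ in rows)
    color_order = first_seen(c for _, c in rows)
    observed = set(rows)
    cells = [cell for cell in product(x_order, color_order) if cell in observed]
    return first_seen(c for _, c in cells)


# Sparse crossing: not every species occurs on every island.
sparse = [
    ("Torgersen", "Adelie"),
    ("Biscoe", "Adelie"),
    ("Dream", "Adelie"),
    ("Dream", "Chinstrap"),
    ("Biscoe", "Gentoo"),
]

# Dense crossing: every species occurs at every level of x.
dense = [
    (sex, species)
    for species in ("Adelie", "Chinstrap", "Gentoo")
    for sex in ("Male", "Female")
]

raw_sparse = first_seen(c for _, c in sparse)
agg_sparse = order_after_agg(sparse)
raw_dense = first_seen(c for _, c in dense)
agg_dense = order_after_agg(dense)

print("sparse:", raw_sparse, "->", agg_sparse)
print("dense: ", raw_dense, "->", agg_dense)
```

In this model the sparse data goes from Adelie, Chinstrap, Gentoo to Adelie, Gentoo, Chinstrap once aggregated, while the dense data keeps its order. That matches the pattern described here, where a layer with Agg and a layer without it can end up with different default orders.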

Currently, the stat and any moves are handled in different parts of the codebase (dating back to an original design where there was a stronger distinction between them). The plan is to refactor things such that stats and moves are more interchangeable. I think it would be much cleaner to use a consistent default ordering across all of the transforms after that happens. So for now, unfortunately, I think it's necessary to pass a Nominal scale with an explicit ordering (as demonstrated above) to avoid this behavior.

mwaskom · Nov 15 '23 23:11