seaborn
Boxenplot styling differs depending on the k_depth option
Hi,
I've been an avid user of boxenplot, and it's a great function!
I noticed that there were some improvements to boxenplot in #2086 that were incorporated into release v0.11. Now, with the default k_depth='tukey', the boxes are "sequentially colored from the input color all the way to white". However, while this new styling also appears to be applied for 'trustworthy', the old styling is still used for 'proportion' and 'full'.
Is this intended behavior? Would it be possible to either clarify in the documentation that the styling may differ depending on k_depth, add the option to have this sequential hue gradient for all values of k_depth, or add an option to turn it on/off, since it's somewhat tricky to work with the resulting cmap of the PatchCollections?
With seaborn=='0.11.1' and matplotlib=='3.3.4' this is what I currently see:
import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")
k_depth_methods = ["proportion", "tukey", "trustworthy", "full"]
fig, axes = plt.subplots(2, 2, figsize=(12, 12), sharex=True, sharey=True)
# One panel per k_depth method, all with the same thick outline
for ax, k in zip(axes.flatten(), k_depth_methods):
    sns.boxenplot(
        x="day", y="total_bill", hue="time", data=tips, linewidth=2.5, k_depth=k, ax=ax
    )
    ax.set_title(k, y=1.05)
    ax.legend(loc="upper left", bbox_to_anchor=(0, 1.05))
sns.despine()
I don't think the k_depth methods use different styling; what's happening is that there is a color ramp with the same start and endpoints, and the proportion/full methods produce more boxes, so you get a shallower slope of the gradient. Also, the tail boxes tend to be very thin, so it's hard to see their facecolor with the default line thickness.
I don't think the styling was changed on purpose with the work for 0.11, but @MaozGelbart led that work and may be able to comment. If it's easy for you to share a before/after picture that might be helpful.
I think an option for disabling the gradient would be reasonable.
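To make the point about the gradient slope concrete, here is a minimal sketch in plain Python. It is not seaborn's code: it assumes a simple linear ramp from the hue color to white, and `ramp`, `max_step`, and the base color are hypothetical names and values chosen for illustration.

```python
def ramp(base_rgb, n_boxes):
    """Linearly interpolate from base_rgb (innermost box) to white (outermost box)."""
    white = (1.0, 1.0, 1.0)
    if n_boxes == 1:
        return [tuple(base_rgb)]
    shades = []
    for i in range(n_boxes):
        t = i / (n_boxes - 1)  # 0 at the innermost box, 1 at the outermost
        shades.append(tuple(b + t * (w - b) for b, w in zip(base_rgb, white)))
    return shades


def max_step(shades):
    """Largest per-channel color change between two adjacent boxes."""
    return max(
        abs(a - b)
        for s0, s1 in zip(shades, shades[1:])
        for a, b in zip(s0, s1)
    )


blue = (0.2, 0.4, 0.8)  # arbitrary base color, not a seaborn palette entry
few = ramp(blue, 3)     # a method that draws few boxes
many = ramp(blue, 9)    # a method that draws many boxes

# Same endpoints, but a smaller color change per box when there are more boxes
print(max_step(few), max_step(many))
```

Both ramps start at the base color and end at white; only the step between neighboring boxes shrinks, which is why the gradient is harder to see when there are many thin boxes.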
I don't think the k_depth methods use different styling; what's happening is that there is a color ramp with the same start and endpoints, and the proportion/full methods produce more boxes, so you get a shallower slope of the gradient. Also, the tail boxes tend to be very thin, so it's hard to see their facecolor with the default line thickness.
I don't think the styling was changed on purpose with the work for 0.11, but @MaozGelbart led that work and may be able to comment.
All true.
Thanks @MaozGelbart!
So I think the path forward is to add a parameter to disable the gradient across the quantile levels. (Or, as I would usually do it, a parameter to enable the gradient with a default of True.) The one thing to be mindful of is that I would like, in the future, to have the possibility of a semantic mapping dimension that maps to color intensity and can be crossed with hue, so the name for this new parameter should be chosen carefully so that it does not collide with whatever that API ends up being.
Ah, thanks for clarifying. I guess it's hard to see the white when there are lots of tiny boxes. Passing linewidth=0 in the call to sns.boxenplot also suppresses the median, but once I turned off the linewidth manually, you can see the ramp for full.
I don't know if this is intended behavior, but does the gradient ramp differently for each boxen or for each hue factor? For tukey you can see that the second blue boxen is lighter than the second orange boxen, which I guess makes conceptual sense, but then the boxes at a given k-depth (aka percentile) could have different amounts of lightness (as you see for Friday/full)?
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")
k_depth_methods = ["proportion", "tukey", "trustworthy", "full"]
fig, axes = plt.subplots(2, 2, figsize=(12, 12), sharex=True, sharey=True)
for ax, k in zip(axes.flatten(), k_depth_methods):
    sns.boxenplot(
        x="day", y="total_bill", hue="time", data=tips, k_depth=k, ax=ax, showfliers=False
    )
    ax.set_title(k, y=1.05)
    ax.legend(loc="upper left", bbox_to_anchor=(0, 1.05))
    for a in ax.collections:
        if isinstance(a, mpl.collections.PatchCollection):
            # remove the outline around each box, leaving the median line intact
            a.set_linewidth(0)
sns.despine()
I think it's done on a per-boxen basis: https://github.com/mwaskom/seaborn/blob/master/seaborn/categorical.py#L1947
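As a rough illustration of what a per-boxen ramp implies for the question above, here is a small sketch. It assumes the ramp is linear and rescaled for each boxen; `whiteness_by_depth` is a hypothetical helper, not a seaborn function.

```python
def whiteness_by_depth(n_levels):
    """Fraction of the way to white for each depth level of a single boxen."""
    if n_levels == 1:
        return [0.0]
    return [i / (n_levels - 1) for i in range(n_levels)]


small_group = whiteness_by_depth(3)  # boxen drawn with 3 levels
large_group = whiteness_by_depth(6)  # boxen drawn with 6 levels

# Same depth index (second box out from the median), different lightness
print(small_group[1], large_group[1])  # prints: 0.5 0.2
```

Under that assumption, a boxen with fewer levels reaches white sooner, so boxes at the same depth in two boxens need not share the same lightness.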