Violin Plot with data subset

Open robertcv opened this issue 3 years ago • 2 comments

What's your use case? I would like the Box Plot widget to have a Data Subset option, as the Scatter Plot widget does. This option would add points to the drawn box plots to show where in the distribution the specific subset lies.

A simple mockup: (image: Screenshot from 2021-10-13 10-18-20)

robertcv, Oct 13 '21 08:10

@robertcv, we discussed this today and found a solution that we think is better. Box plots are not point-based (or instance-based) visualizations. Violin plots are. The plot could (probably) easily show the subset by drawing the area for the entire data in a lighter shade and then superimposing a darker plot for the subset.

janezd, Oct 22 '21 10:10
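
To make the overlay idea from the comment above concrete, here is a minimal sketch in plain Python. It is not the Orange widget's code; the function names `kernel_counts` and `violin_overlay` and the bandwidth rule are assumptions made for illustration. Both outlines use the same bandwidth and are scaled by counts rather than normalized separately, so the subset outline always fits inside the outline of the entire data.

```python
import math
import statistics


def kernel_counts(values, grid, bandwidth):
    """Sum of Gaussian kernels at each grid point (count-scaled, not divided by n)."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values) / norm
        for g in grid
    ]


def violin_overlay(data, subset, n_grid=64):
    """Return (grid, full_widths, subset_widths) for a violin with a subset overlay.

    Hypothetical helper: widths are half-widths of the violin, scaled so the
    widest point of the full-data outline equals 1.
    """
    # Assumed bandwidth: the normal reference rule of thumb, 1.06 * sd * n^(-1/5).
    spread = statistics.stdev(data) if len(data) > 1 else 0.0
    bandwidth = 1.06 * spread * len(data) ** -0.2 if spread > 0 else 1.0

    # Extend the grid a little past the data so the outlines taper off.
    lo = min(data) - 3 * bandwidth
    hi = max(data) + 3 * bandwidth
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]

    full = kernel_counts(data, grid, bandwidth)
    part = kernel_counts(subset, grid, bandwidth)

    peak = max(full)
    return grid, [w / peak for w in full], [w / peak for w in part]


data = [1.0, 2.0, 2.5, 3.0, 3.2, 4.0, 5.5, 6.0]
subset = [2.5, 3.0, 3.2]  # instances that also appear in data
grid, full_widths, subset_widths = violin_overlay(data, subset)
```

Drawing `full_widths` in a light shade and `subset_widths` in a darker shade on top, mirrored around the violin's axis, would give the lighter-and-darker effect described above. The nesting only holds when every subset instance is also in the full data.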

What about beeswarm plots?

They are arguably better, as they clearly show the number of points behind the distribution. https://datascience.stackexchange.com/questions/71709/how-is-the-beeswarm-plot-better-than-a-histogram

I prefer the SHAP implementation (which doesn't add curvature): https://shap.readthedocs.io/en/latest/example_notebooks/api_examples/plots/beeswarm.html

https://github.com/eclarke/ggbeeswarm

franktoffel, Jul 14 '23 14:07
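
For readers unfamiliar with beeswarm plots, the basic layout can be sketched as follows. This is a simple greedy placement written for illustration, not the algorithm used by SHAP or ggbeeswarm; the function name `beeswarm_offsets` and the `radius` parameter are hypothetical. Each point keeps its value on one axis and is pushed sideways just far enough that its marker does not overlap markers already placed.

```python
import math


def beeswarm_offsets(values, radius=0.1):
    """Return one sideways offset per value (in input order) so that circles
    of the given radius, centred at (offset, value), do not overlap."""
    diameter = 2.0 * radius
    placed = []  # (value, offset) of points already positioned
    offsets = [0.0] * len(values)

    # Place points in order of increasing value.
    for idx in sorted(range(len(values)), key=values.__getitem__):
        v = values[idx]
        # Only points closer than one diameter along the value axis can collide.
        neighbours = [(pv, po) for pv, po in placed if abs(pv - v) < diameter]

        # Candidate offsets: the centre line, or just touching a neighbour.
        candidates = [0.0]
        for pv, po in neighbours:
            dx = math.sqrt(diameter ** 2 - (pv - v) ** 2)
            candidates += [po + dx, po - dx]

        # Take the candidate nearest the centre line that collides with nothing.
        for c in sorted(candidates, key=abs):
            if all(
                (pv - v) ** 2 + (po - c) ** 2 >= diameter ** 2 - 1e-9
                for pv, po in neighbours
            ):
                offsets[idx] = c
                break
        placed.append((v, offsets[idx]))
    return offsets


values = [1.0, 1.0, 1.0, 1.05, 2.0]
offsets = beeswarm_offsets(values, radius=0.1)
```

Because every instance is drawn as its own marker, a data subset could be highlighted simply by coloring its markers differently, which is what makes the plot instance-based in the sense discussed earlier in the thread.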