[Doc]: outdated links for violin/boxplot
Documentation Link
https://matplotlib.org/devdocs/gallery/statistics/boxplot_vs_violin.html
Problem
This example states
> For more information on violin plots, the scikit-learn docs have a great section: https://scikit-learn.org/stable/modules/density.html
However, the linked scikit-learn page does not mention violin plots (and I could not find anything by searching their documentation).
Also the link for the boxplot document does not work for me in Chrome. It does work if I use https instead of http.
Suggested improvement
Remove the quoted line or find a new reference for violin plots.
Update the boxplot url from http to https.
Maybe switch to https://seaborn.pydata.org/generated/seaborn.violinplot.html ? That way we're also pointing people there.
I don't think the seaborn link is suitable. It doesn't explain much about violin plots and says nothing about boxplots. It only describes the seaborn API (which, strictly speaking, is none of Matplotlib's business).
IMHO:
- We are generally missing an explanation of violin plots. A minimal explanation should be included in https://matplotlib.org/devdocs/api/_as_gen/matplotlib.axes.Axes.violinplot.html. That may be as small as linking to https://en.wikipedia.org/wiki/Violin_plot.
- We could have a more in-depth description. Atlassian has a good explanation. Do we have a policy on linking to company websites?
- I'm questioning the usefulness of the "boxplot vs. violinplot" example. It's not a primary task of Matplotlib to teach people which plots to use for which applications (we also don't discuss plot() with line vs marker). We only need to tell people how to do a violin plot once they have chosen to use one. Therefore, I'd have a slight preference for removing the example altogether. The second best option would be a short paragraph describing the difference in our own words: violin plots show more detail, which is good because they convey more information. OTOH, boxplots show aggregated statistical quantities and are thus simpler; showing less information can be better if that information is sufficient for you. Seeing a median and a spread is easier in a boxplot than in a violin plot.
I think it would be good to retain a simple violin plot example in some form. We have Violin Plot Basics which actually seems to be many ways to customise the violin plot and also separately Violin Plot Customization which is a different sort of customisation but that isn't obvious from the title. I note both of those also have the scikit-learn link.
I ended up here because I was looking for a simple example to point a collaborator to and say "this might be a good way to visualise that data".
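For illustration, a minimal side-by-side example of the kind discussed here might look like the sketch below. The data, the figure size, and the output filename are made up for this sketch; only `Axes.boxplot` and `Axes.violinplot` with their default settings (plus `showmedians=True`) are used.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Three made-up samples with increasing spread (illustrative data only).
data = [rng.normal(0.0, std, size=100) for std in (1.0, 2.0, 3.0)]

fig, (ax_box, ax_violin) = plt.subplots(1, 2, sharey=True, figsize=(8, 4))

# Box plot: shows aggregated quantities (median, quartiles, whiskers, fliers).
box = ax_box.boxplot(data)
ax_box.set_title("Box plot")

# Violin plot: shows an estimate of the full distribution of each sample.
violin = ax_violin.violinplot(data, showmedians=True)
ax_violin.set_title("Violin plot")

fig.savefig("boxplot_vs_violin.png")  # hypothetical output filename
```

Both calls return a dictionary of the artists they created, so the individual boxes or violin bodies can be styled afterwards if needed.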
> The second best option would be a short paragraph describing the difference in our own words: violin plots show more detail, which is good because they convey more information. OTOH, boxplots show aggregated statistical quantities and are thus simpler; showing less information can be better if that information is sufficient for you. Seeing a median and a spread is easier in a boxplot than in a violin plot.
What about for violin taking an approach like boxplot and having a small breakdown of what's being computed/how? For instance, nothing in the docs mentions that the violin is the rotated KDE. I think providing that info lets folks make the decision about which is better for their use case, rather than us recommending one over the other.
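To make "what's being computed" concrete, here is a rough standard-library sketch of the two sets of quantities. It is not Matplotlib's implementation: the Gaussian kernel, Scott's rule-of-thumb bandwidth, the 100-point grid between the sample minimum and maximum, and the helper name `gaussian_kde` are all assumptions made for this illustration, and the data is random.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at x.

    Hypothetical helper for this sketch: a plain 1-D Gaussian kernel
    density estimate with a fixed bandwidth.
    """
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(200)]  # made-up data

# Assumed bandwidth choice: Scott's rule of thumb for 1-D data.
bandwidth = statistics.stdev(samples) * len(samples) ** (-1 / 5)
density = gaussian_kde(samples, bandwidth)

# Violin: evaluate the KDE on a grid spanning the data range.
lo, hi = min(samples), max(samples)
grid = [lo + i * (hi - lo) / 99 for i in range(100)]
half_widths = [density(y) for y in grid]

# "Rotated KDE": at each height y the body spans from -w to +w,
# i.e. the density curve is drawn sideways and mirrored.
outline = [(-w, w) for w in half_widths]

# Box plot: a handful of aggregated quantities instead of a curve.
q1, median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
```

The violin keeps a whole curve (`half_widths` over `grid`), while the box plot reduces the same sample to a few numbers such as `q1`, `median`, and `q3`. That difference is the trade-off described in the comments above.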
> What about for violin taking an approach like boxplot and having a small breakdown of what's being computed/how
This is my first bullet point:
> - We are generally missing an explanation of violin plots. A minimal explanation should be included in https://matplotlib.org/devdocs/api/_as_gen/matplotlib.axes.Axes.violinplot.html. That may be as small as linking to https://en.wikipedia.org/wiki/Violin_plot.