Spurious lines on violin plot
ALL software version info
python 3.11.8 hab00c5b_0_cpython conda-forge
hvplot 0.9.2 pyhd8ed1ab_0 conda-forge
holoviews 1.18.3 pyhd8ed1ab_0 conda-forge
bokeh 3.3.4 pyhd8ed1ab_0 conda-forge
Description of expected behavior and the observed behavior
The violin plot occasionally produces spurious lines. The same data plotted with box renders correctly.
Complete, minimal, self-contained example code that reproduces the issue
Data file: bar.csv
This is a stripped-down version of a large data file, reduced to the smallest number of rows with which I could reproduce the behaviour.
import pandas as pd
import hvplot.pandas
df = pd.read_csv('/tmp/bar.csv')
df.hvplot.violin(by='month')
df.hvplot.box(by='month')
Stack traceback and/or browser JavaScript console output
Screenshots or screencasts of the bug in action
- [ ] I may be interested in making a pull request to address this
This seems to happen because the only value for Jun is 0.
I doubt this is caused by groups that contain only 0 values. Here is a plot with the full dataset.
There are several months with no ice, but no spurious lines show up. Also, the plot in my previous message looks fine when rendered with the matplotlib extension, which suggests the bug might be in holoviews/plotting/bokeh/stats.py.
You are right. If I reduce your data to the following, I get the line.
import pandas as pd
import hvplot.pandas
import holoviews as hv
from io import StringIO
data = """
time,A,month
2009-06-01,0.0,Jun
2009-07-01,0.88,Jul
2009-07-02,0.96,Jul
2009-07-03,1.0,Jul
2009-07-04,0.95,Jul
2009-07-05,0.93,Jul
2008-07-01,0.47,Jul
2008-07-02,0.94,Jul
2008-07-03,0.93,Jul
2008-07-04,0.89,Jul
2008-07-05,0.95,Jul
2008-07-06,0.94,Jul
2008-07-07,0.91,Jul
2008-07-08,0.96,Jul
2008-07-09,0.90,Jul
2008-07-10,0.96,Jul
2008-07-11,0.9,Jul
2008-07-12,0.96,Jul
2008-07-13,0.94,Jul
2008-07-14,0.9,Jul
2008-07-15,0.95,Jul
2008-07-16,0.94,Jul
2008-07-17,0.89,Jul
2008-07-18,0.89,Jul
2008-07-19,0.87,Jul
2008-07-20,0.9,Jul
2008-07-21,0.90,Jul
2008-07-23,0.90,Jul
2008-07-24,0.93,Jul
2008-07-25,0.96,Jul
2008-07-26,1.0,Jul
2008-07-27,0.94,Jul
2008-07-28,0.88,Jul
2008-07-29,0.95,Jul
2008-07-31,0.98,Jul
"""
sio = StringIO(data.strip())
sio.seek(0)
df = pd.read_csv(sio)
df.hvplot.violin(by="month")
But if I remove any single row from Jul, the spurious line no longer shows up. I have absolutely no idea why this is happening. The following builds one plot per removed row:
plots = []
lines = data.strip().split("\n")
# Index 0 is the header and index 1 is the Jun row; drop one Jul row at a time.
for i in range(2, len(lines)):
    d = "\n".join([line for idx, line in enumerate(lines) if i != idx])
    sio = StringIO(d)
    sio.seek(0)
    df = pd.read_csv(sio)
    # Label each plot with the row that was removed.
    plots.append(df.hvplot.violin(by="month", label=lines[i]))
hv.Layout(plots).cols(2).opts(shared_axes=False)
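The leave-one-out reduction above can also be checked without any plotting, for example to confirm how many rows each month keeps in every variant. The following is only a sketch using the standard library; the helper names `leave_one_out` and `rows_per_group` are made up for illustration, and the sample is the first few rows of the reduced data set.

```python
import csv
from collections import Counter
from io import StringIO


def leave_one_out(csv_text, first_row=2):
    """Yield (dropped_line, reduced_csv), removing one data row at a time.

    Index 0 is the header; lines before `first_row` are always kept.
    """
    lines = csv_text.strip().split("\n")
    for i in range(first_row, len(lines)):
        kept = [line for idx, line in enumerate(lines) if idx != i]
        yield lines[i], "\n".join(kept)


def rows_per_group(csv_text, column="month"):
    """Count how many rows each value of `column` has."""
    reader = csv.DictReader(StringIO(csv_text))
    return Counter(row[column] for row in reader)


# First rows of the reduced data set from the thread.
sample = """
time,A,month
2009-06-01,0.0,Jun
2009-07-01,0.88,Jul
2009-07-02,0.96,Jul
2009-07-03,1.0,Jul
"""

for dropped, reduced in leave_one_out(sample):
    print(dropped, dict(rows_per_group(reduced)))
```

Each reduced CSV string can then be passed to `pd.read_csv(StringIO(reduced))` and plotted as in the loop above.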
This is a WebGL problem. Can you try disabling WebGL by setting hv.renderer("bokeh").webgl = False?
Thanks, that worked.