holoviews
Show/hide individual categories in datashaded (count_cat) categorical data?
Is there a way to show/hide individual categories in datashaded (count_cat) categorical data, similar to what Bokeh interactive legend does? I'm currently emulating this behavior with ipywidgets and a Pipe that filters out data of unwanted categories before datashading. This almost works except the coloring will change as data come and go. For example:
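import numpy as np
import holoviews as hv
import holoviews.operation.datashader as hd
import datashader as ds
hv.extension('bokeh')

# Two sparse 'blue' clusters on either side of one dense 'red' cluster
N1 = 10000
N2 = 2000
x = np.concatenate([np.random.normal(-5, 1, N2),
                    np.random.normal(0, 0.5, N1),
                    np.random.normal(5, 1, N2)])
y = np.flip(x, axis=0)
z = np.concatenate([np.full(N2, 'blue'),
                    np.full(N1, 'red'),
                    np.full(N2, 'blue')])
color_key = {'red': '#e41a1c', 'blue': '#377eb8'}
points = hv.Points((x, y, z), kdims=['x', 'y'], vdims='z')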
Show both blue and red categories:
hd.dynspread(hd.datashade(points, aggregator=ds.count_cat('z'), color_key=color_key))
Now just show the blue category:
hd.dynspread(hd.datashade(points.select(z='blue'), aggregator=ds.count_cat('z'), color_key=color_key))
Notice the blue points appear more 'dense' in the second picture as removing the red data points throws off the scale for normalization.
As far as I'm aware there is no way of using Bokeh's interactive legend with datashader, regardless of whether you use holoviews or not. I may be wrong though so if there is such an example, I would like to see it and support it in holoviews!
Bokeh does not support legend entries for images (last I checked neither does matplotlib).
Just to be clear, I do think this is a valid feature request and would be a very nice thing to have. I just don't think it is technically feasible right now without substantial effort to update both bokeh and holoviews. Do you agree with that assessment, @jbednar?
It would be conceivable to extend datashader's shade() function (and the underlying _colorize() function) to allow a category mask to be passed in so that the same normalization is done in shade() (respecting all categories) but some categories are not actually rendered. But then actually controlling it from Bokeh and/or HoloViews would have to be set up specially.
Alternatively, @jlstevens, I think this comes under the category of things we discussed that would work more uniformly if rasterize() were used here instead of datashade(), and then colors were computed from within Bokeh. If done that way, it seems like Bokeh's legends could be connected up in a very natural way, along with Bokeh's colorbars, as previously discussed for shading in general.
@jbednar, is it currently possible to show a legend entry for rasterized hv.Curves?
You can use the "fake legend" trick from http://holoviews.org/user_guide/Large_Data.html#multidimensional-plots, using invisible points or very short lines.
Thanks for the tip. Somehow, I managed to miss it, although I've been looking all over the place for solutions, despite that user guide being the first place I found describing rasterize and decimate.
Ok, now, to provide more context: What I'm actually interested in is also hiding individual curves. E.g., exactly in the user guide section you mentioned, the line overlay example suffers from a lot of clutter. My current use case is that the user wants to see multiple curves on the same plot (cluttered), with the same axis ranges, and then select pairs of curves to inspect in more detail.
Unknowingly (although I think I have stumbled upon this trick somewhere else, too), I'm actually employing it, and here's why: the user also needs to see the exact value for a certain point, and that's why I'm also overlaying a decimated version of the Points underlying the Curve. I need both because, when decimating, some peaks the user cares about are not apparent any more (they don't make it through the subsampling). So I do have some legend entries for some of the points on the curve, but I can't use them to hide the curves themselves.
Now, it's possible to implement a different subsampling function that preserves the important peaks, but that is not as easy as it sounds, because the signals are not stationary, so they tend to have different local frequencies, and most peak detection algorithms use a fixed-size window of some sort.
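For illustration, one simple form such a subsampling function could take is min/max bucketing: split the signal into buckets and keep the lowest and highest sample from each. This is only a sketch, and minmax_decimate is a hypothetical name, not a HoloViews operation. It also uses fixed-size buckets, so it has exactly the limitation described above. It does not detect peaks; it only guarantees that the extreme values in each bucket survive.

```python
import numpy as np

def minmax_decimate(x, y, n_buckets):
    """Hypothetical helper: keep the min and max sample of each bucket.

    Returns at most 2 * n_buckets samples, in their original order.
    """
    x = np.asarray(x)
    y = np.asarray(y)

    # Bucket boundaries as integer indices into the arrays.
    edges = np.linspace(0, len(y), n_buckets + 1).astype(int)

    keep = set()
    for lo, hi in zip(edges[:-1], edges[1:]):
        if hi <= lo:
            continue  # empty bucket (more buckets than samples)
        keep.add(lo + int(np.argmin(y[lo:hi])))
        keep.add(lo + int(np.argmax(y[lo:hi])))

    idx = np.array(sorted(keep), dtype=int)
    return x[idx], y[idx]
```

The returned arrays could then be used for the hoverable points in place of a random subsample, so that isolated spikes remain visible.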
The quick and easy (and, hopefully, temporary) solution I found was to overlay both the rasterized curve and the decimated points, thus allowing for manual (actually, visual) inspection before fully automating the peak detection (which might take quite a bit of time and effort).
Now that I've written this lengthy comment, I realize that maybe it belongs on some forum more than here, but, in my defense, I posted questions about this both on Stack Overflow and on the HoloViz Discourse page, but never got an answer, and thought that perhaps here there's a chance to understand what I'm doing wrong.