
Show/hide individual categories in datashaded (count_cat) categorical data?

Open suyang-nju opened this issue 6 years ago • 7 comments

Is there a way to show/hide individual categories in datashaded (count_cat) categorical data, similar to what Bokeh interactive legend does? I'm currently emulating this behavior with ipywidgets and a Pipe that filters out data of unwanted categories before datashading. This almost works except the coloring will change as data come and go. For example:

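import numpy as np
import holoviews as hv
import datashader as ds
import holoviews.operation.datashader as hd

hv.extension('bokeh')

# One large red cluster in the middle, flanked by two smaller blue clusters
N1 = 10000
N2 = 2000
x = np.concatenate([np.random.normal(-5, 1, N2),
                    np.random.normal(0, 0.5, N1),
                    np.random.normal(5, 1, N2)])
y = np.flip(x, axis=0)
z = np.concatenate([np.full(N2, 'blue'),
                    np.full(N1, 'red'),
                    np.full(N2, 'blue')])
color_key = {'red': '#e41a1c', 'blue': '#377eb8'}
points = hv.Points((x, y, z), kdims=['x', 'y'], vdims='z')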

Show both blue and red categories:

hd.dynspread(hd.datashade(points, aggregator=ds.count_cat('z'), color_key=color_key))

[Image 1]

Now just show the blue category:

hd.dynspread(hd.datashade(points.select(z='blue'), aggregator=ds.count_cat('z'), color_key=color_key))

[Image 2]

Notice that the blue points appear more 'dense' in the second picture: removing the red data points changes the range of counts used for normalization.
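The effect can be reproduced without any plotting library. The sketch below is a simplified model, not datashader's actual code: it assumes a linear alpha ramp over the per-pixel totals (datashader's default is histogram equalization), and `alpha_from_totals` and `min_alpha=40` are names and values chosen here for illustration.

```python
import numpy as np

def alpha_from_totals(counts, min_alpha=40):
    """Map per-pixel totals to alpha with a linear ramp (simplified stand-in for shade())."""
    total = counts.sum(axis=-1).astype(float)
    occupied = total > 0
    alpha = np.zeros_like(total)
    if occupied.any():
        lo, hi = total[occupied].min(), total[occupied].max()
        span = hi - lo if hi > lo else 1.0
        alpha[occupied] = min_alpha + (255 - min_alpha) * (total[occupied] - lo) / span
    return alpha

# Toy 1x3 "image": per-pixel counts for the categories (blue, red).
both = np.array([[[20, 0], [5, 100], [20, 0]]])
blue_only = both.copy()
blue_only[..., 1] = 0  # drop the red category before shading

alpha_both = alpha_from_totals(both)       # ramp spans totals 20..105
alpha_blue = alpha_from_totals(blue_only)  # ramp spans totals 5..20

print(alpha_both[0, 0], alpha_blue[0, 0])
```

The pure-blue pixel holds 20 points in both cases, but it sits at the bottom of the ramp when red is present and at the top once red is filtered out, which is why it looks denser.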

suyang-nju avatar May 20 '18 06:05 suyang-nju

As far as I'm aware there is no way of using Bokeh's interactive legend with datashader, regardless of whether you use holoviews or not. I may be wrong though so if there is such an example, I would like to see it and support it in holoviews!

jlstevens avatar Jun 06 '18 13:06 jlstevens

Bokeh does not support legend entries for images (last I checked neither does matplotlib).

philippjfr avatar Jun 06 '18 13:06 philippjfr

Just to be clear, I do think this is a valid feature request and would be a very nice thing to have. I just don't think it is technically feasible right now without substantial effort to update both bokeh and holoviews. Do you agree with that assessment @jbednar ?

jlstevens avatar Jun 06 '18 13:06 jlstevens

It would be conceivable to extend datashader's shade() function (and the underlying _colorize() function) to allow a category mask to be passed in so that the same normalization is done in shade() (respecting all categories) but some categories are not actually rendered. But then actually controlling it from Bokeh and/or HoloViews would have to be set up specially.
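A rough sketch of what such a category mask could look like, written in plain NumPy rather than against datashader's internals. Everything here is an assumption for illustration: `shade_with_mask` is a hypothetical function, the alpha ramp is linear, and the ramp limits are taken from the totals over all categories and then applied to the visible counts.

```python
import numpy as np

def shade_with_mask(counts, colors, visible, min_alpha=40):
    """counts: (H, W, C); colors: (C, 3) RGB; visible: (C,) bools. Returns (H, W, 4) uint8 RGBA."""
    counts = np.asarray(counts, dtype=float)
    colors = np.asarray(colors, dtype=float)
    visible = np.asarray(visible, dtype=bool)

    total = counts.sum(axis=-1)       # normalization range uses every category
    shown = counts * visible          # hidden categories contribute nothing
    shown_total = shown.sum(axis=-1)

    rgba = np.zeros(counts.shape[:2] + (4,), dtype=np.uint8)
    drawn = shown_total > 0
    if not drawn.any():
        return rgba

    occupied = total > 0
    lo, hi = total[occupied].min(), total[occupied].max()
    span = hi - lo if hi > lo else 1.0

    # Color: count-weighted mix of the visible categories' colors.
    mix = shown @ colors
    rgba[drawn, :3] = (mix[drawn] / shown_total[drawn][:, None]).round().astype(np.uint8)

    # Alpha: visible counts placed on the ramp derived from all categories.
    frac = np.clip((shown_total[drawn] - lo) / span, 0.0, 1.0)
    rgba[drawn, 3] = (min_alpha + (255 - min_alpha) * frac).round().astype(np.uint8)
    return rgba

colors = [[55, 126, 184], [228, 26, 28]]             # blue '#377eb8', red '#e41a1c'
counts = np.array([[[20, 0], [5, 100], [20, 0]]])    # (blue, red) counts per pixel

full = shade_with_mask(counts, colors, visible=[True, True])
masked = shade_with_mask(counts, colors, visible=[True, False])
```

With this choice, pixels that contain only blue points get the same RGBA value whether or not red is shown, so toggling a category does not rescale the others. How mixed pixels should look when one of their categories is hidden is a separate design decision; here they simply fall to the bottom of the ramp.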

Alternatively, @jlstevens, I think this comes under the category of things we discussed that would work more uniformly if rasterize() were used here instead of datashade(), and then colors were computed from within Bokeh. If done that way, it seems like Bokeh's legends could be connected up in a very natural way, along with Bokeh's colorbars, as previously discussed for shading in general.

jbednar avatar Jun 06 '18 20:06 jbednar

@jbednar, is it currently possible to show a legend entry for rasterized hv.Curves?

bbudescu avatar Jul 20 '22 15:07 bbudescu

You can use the "fake legend" trick from http://holoviews.org/user_guide/Large_Data.html#multidimensional-plots , using invisible points or very short lines.

jbednar avatar Jul 21 '22 01:07 jbednar

Thanks for the tip. Somehow I managed to miss it, even though I've been looking all over the place for solutions and that user guide was the first place I found that describes rasterize and decimate.

Ok, now, to provide more context: what I'm actually interested in is also hiding individual curves. E.g., in exactly the user guide section you mentioned, the line overlay example suffers from a lot of clutter. In my current use case, the user wants to see multiple curves on the same plot (cluttered), with the same axis ranges, and then select pairs of curves to inspect in more detail.

Unknowingly (although I think I have stumbled upon this trick somewhere else, too), I'm actually employing it, and here's why: the user also needs to see the exact value for a certain point, and that's why I'm also overlaying a decimated version of the Points underlying the Curve. I need both because, when decimating, some peaks the user cares about are not apparent any more (they don't make it through the subsampling). So I do have some legend entries for some of the points on the curve, but I can't use them to hide the curves themselves.

Now, it's possible to implement a different subsampling function that preserves the important peaks, but that is not as easy as it sounds, because the signals are not stationary, so they tend to have different local frequencies, and most peak detection algorithms use a fixed-size window of some sort.
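For reference, one simple alternative that avoids peak detection altogether is min/max decimation: split the signal into buckets and keep the smallest and largest sample of each. The sketch below is only an illustration of that general technique, not what is used in this comment; `minmax_decimate` and `n_buckets` are hypothetical names, and the buckets are still fixed-size, so it guarantees only that each bucket's extremes survive.

```python
import numpy as np

def minmax_decimate(y, n_buckets):
    """Return sorted sample indices that keep the min and max of each bucket."""
    y = np.asarray(y)
    edges = np.linspace(0, len(y), n_buckets + 1).astype(int).tolist()
    keep = set()
    for start, stop in zip(edges[:-1], edges[1:]):
        if stop > start:
            chunk = y[start:stop]
            keep.add(start + int(chunk.argmin()))
            keep.add(start + int(chunk.argmax()))
    return np.array(sorted(keep))

# A slow sine with one narrow spike that plain striding would most likely skip.
t = np.linspace(0, 10, 10_000)
signal = np.sin(t)
signal[4321] = 5.0

idx = minmax_decimate(signal, n_buckets=100)
t_small, signal_small = t[idx], signal[idx]  # at most 2 samples per bucket
```

The returned indices could then be used to build the points overlay, so that hover values are exact samples from the original signal.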

The quick and easy (and, hopefully, temporary) solution I found was to overlay both the rasterized curve and the decimated points, thus allowing for manual (actually, visual) inspection before fully automating the peak detection (which might take quite a bit of time and effort).

Now that I've written this lengthy comment, I realize that it may belong on a forum rather than here. In my defense, I posted questions about this on both Stack Overflow and the HoloViz Discourse, but never got an answer, and I thought that perhaps here there's a chance to understand what I'm doing wrong.

bbudescu avatar Jul 21 '22 08:07 bbudescu