Datashader can't be used to plot multiple columns from a Dataframe

Open pepijndevos opened this issue 3 years ago • 6 comments

ALL software version info

version info
$ pip freeze
aiocouch==2.2.1
aiohttp==3.8.1
aiosignal==1.2.0
anyio==3.5.0
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
async-timeout==4.0.2
attrs==21.4.0
Babel==2.9.1
backcall==0.2.0
bleach==4.1.0
bokeh==2.4.2
certifi==2021.10.8
cffi==1.15.0
chardet==4.0.0
charset-normalizer==2.0.10
chevron==0.14.0
click==8.0.4
cloudpickle==2.0.0
colorcet==3.0.0
dask==2022.3.0
datashader==0.13.0
datashape==0.5.2
debugpy==1.5.1
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.13
distributed==2022.3.0
docker==5.0.3
entrypoints==0.3
escapism==1.0.1
frozenlist==1.3.0
fsspec==2022.2.0
HeapDict==1.0.1
holoviews==1.14.8
hvplot==0.7.3
ibm-cloud-sdk-core==3.13.2
ibmcloudant==0.0.41
idna==3.3
ipykernel==6.6.1
ipython==7.31.0
ipython-genutils==0.2.0
ipywidgets==7.6.5
iso8601==1.0.2
jedi==0.18.1
Jinja2==3.0.3
json5==0.9.6
jsonschema==4.4.0
jupyter-bokeh==3.0.4
jupyter-client==7.1.0
jupyter-core==4.9.1
jupyter-repo2docker==2021.8.0
jupyter-server==1.13.2
-e git+https://github.com/jupyterhub/jupyter-server-proxy.git@26d7569a7d2e7c806fcafe3b538ecbe061e7c0d8#egg=jupyter_server_proxy
jupyterlab==3.2.7
jupyterlab-pygments==0.1.2
jupyterlab-server==2.10.3
jupyterlab-widgets==1.0.2
llvmlite==0.38.0
locket==0.2.1
Markdown==3.3.6
MarkupSafe==2.0.1
matplotlib-inline==0.1.3
mistune==0.8.4
more-itertools==8.12.0
msgpack==1.0.3
multidict==6.0.2
multipledispatch==0.6.0
nbclassic==0.3.5
nbclient==0.5.9
nbconvert==6.4.0
nbformat==5.1.3
nest-asyncio==1.5.4
notebook==6.4.7
numba==0.55.1
numpy==1.21.5
packaging==21.3
pandas==1.3.5
pandocfilters==1.5.0
panel==0.12.6
param==1.12.0
parso==0.8.3
partd==1.2.0
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.0.0
plotly==5.6.0
prometheus-client==0.12.0
prompt-toolkit==3.0.24
psutil==5.9.0
ptyprocess==0.7.0
pycapnp==1.1.0
pycparser==2.21
pyct==0.4.8
Pygments==2.11.2
PyJWT==2.3.0
pyparsing==3.0.6
pyrsistent==0.18.0
python-dateutil==2.8.2
python-json-logger==2.0.2
-e git+ssh://[email protected]/NyanCAD/Pyttoresque.git@839f012e6b4061e378825aa8e039191a821c97ef#egg=Pyttoresque
pytz==2021.3
pyviz-comms==2.1.0
PyYAML==6.0
pyzmq==22.3.0
requests==2.27.1
ruamel.yaml==0.17.20
ruamel.yaml.clib==0.2.6
scipy==1.8.0
semver==2.13.0
Send2Trash==1.8.0
simpervisor==0.4
six==1.16.0
sniffio==1.2.0
sortedcontainers==2.4.0
streamz==0.6.3
tblib==1.7.0
tenacity==8.0.1
terminado==0.12.1
testpath==0.5.0
toml==0.10.2
toolz==0.11.2
tornado==6.1
tqdm==4.63.0
traitlets==5.1.1
typing_extensions==4.0.1
urllib3==1.26.8
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.2.3
widgetsnbextension==3.5.2
wrapt==1.14.0
xarray==2022.3.0
yarl==1.7.2
zict==2.1.0

Description of expected behavior and the observed behavior

When streaming a DataFrame into a Buffer and plotting some of its columns with Datashader, a KeyError is raised. The same plot works fine without Datashader.

A workaround is to extract a tuple of x/y data, and give the axis the same name for all plots.

Complete, minimal, self-contained example code that reproduces the issue

Setup

import holoviews as hv
import datashader as ds
import pandas as pd
import numpy as np
from holoviews.streams import Buffer, Stream, param
from holoviews.operation.datashader import datashade, shade, dynspread, spread, rasterize

hv.extension('plotly')

active_traces = Stream.define('traces', cols=[])

def _timeplot(data, cols=[]):
    # traces = {k: hv.Curve((data.index, data[k]), 'time', 'amplitude') for k in cols}
    traces = {k: hv.Curve(data, 'index', k) for k in cols}
    return hv.NdOverlay(traces, kdims='k')

def timeplot(streams):
    curve_dmap = hv.DynamicMap(_timeplot, streams=streams)
    return spread(datashade(curve_dmap, aggregator=ds.count_cat('k'), streams=[hv.streams.PlotSize]))

def nods_plot(streams):
    return hv.DynamicMap(_timeplot, streams=streams)

n=100
m=1000
def stream(streamdict):
    for i in range(n):
        res = {"stuff": pd.DataFrame({'foo': np.random.rand(m), 'bar': np.random.rand(m)+0.8, 'baz': np.random.rand(m)-0.8}, index=np.arange(m)+m*i)}
        for k, v in res.items():
            #print(list(v.columns), list(streamdict[k].data.columns))
            if k in streamdict and list(v.columns) == list(streamdict[k].data.columns):
                streamdict[k].send(v)
            else:
                buf = Buffer(v, length=int(1e9), index=False)
                streamdict[k] = buf
        yield

cols = active_traces(cols=['foo', 'bar', 'baz'])
d = {}
it = stream(d)
next(it)

Now render the plot without datashader

nods_plot([d['stuff'], cols])

and try the same with datashader

timeplot([d['stuff'], cols])

Stack traceback and/or browser JavaScript console output

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/code/asic/pyttoresque/env/lib/python3.10/site-packages/IPython/core/formatters.py in __call__(self, obj, include, exclude)
    968 
    969             if method is not None:
--> 970                 return method(include=include, exclude=exclude)
    971             return None
    972         else:

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/dimension.py in _repr_mimebundle_(self, include, exclude)
   1314         combined and returned.
   1315         """
-> 1316         return Store.render(self)
   1317 
   1318 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/options.py in render(cls, obj)
   1403         data, metadata = {}, {}
   1404         for hook in hooks:
-> 1405             ret = hook(obj)
   1406             if ret is None:
   1407                 continue

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/ipython/display_hooks.py in pprint_display(obj)
    280     if not ip.display_formatter.formatters['text/plain'].pprint:
    281         return None
--> 282     return display(obj, raw_output=True)
    283 
    284 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/ipython/display_hooks.py in display(obj, raw_output, **kwargs)
    256     elif isinstance(obj, (HoloMap, DynamicMap)):
    257         with option_state(obj):
--> 258             output = map_display(obj)
    259     elif isinstance(obj, Plot):
    260         output = render(obj)

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/ipython/display_hooks.py in wrapped(element)
    144         try:
    145             max_frames = OutputSettings.options['max_frames']
--> 146             mimebundle = fn(element, max_frames=max_frames)
    147             if mimebundle is None:
    148                 return {}, {}

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/ipython/display_hooks.py in map_display(vmap, max_frames)
    204         return None
    205 
--> 206     return render(vmap)
    207 
    208 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/ipython/display_hooks.py in render(obj, **kwargs)
     66         renderer = renderer.instance(fig='png')
     67 
---> 68     return renderer.components(obj, **kwargs)
     69 
     70 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/plotting/renderer.py in components(self, obj, fmt, comm, **kwargs)
    408                 doc = Document()
    409                 with config.set(embed=embed):
--> 410                     model = plot.layout._render_model(doc, comm)
    411                 if embed:
    412                     return render_model(model, comm)

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/panel/viewable.py in _render_model(self, doc, comm)
    453         if comm is None:
    454             comm = state._comm_manager.get_server_comm()
--> 455         model = self.get_root(doc, comm)
    456 
    457         if config.embed:

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/panel/viewable.py in get_root(self, doc, comm, preprocess)
    510         """
    511         doc = init_doc(doc)
--> 512         root = self._get_model(doc, comm=comm)
    513         if preprocess:
    514             self._preprocess(root)

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/panel/layout/base.py in _get_model(self, doc, root, parent, comm)
    120         if root is None:
    121             root = model
--> 122         objects = self._get_objects(model, [], doc, root, comm)
    123         props = dict(self._init_params(), objects=objects)
    124         model.update(**self._process_param_change(props))

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/panel/layout/base.py in _get_objects(self, model, old_objects, doc, root, comm)
    110             else:
    111                 try:
--> 112                     child = pane._get_model(doc, root, model, comm)
    113                 except RerenderError:
    114                     return self._get_objects(model, current_objects[:i], doc, root, comm)

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/panel/pane/holoviews.py in _get_model(self, doc, root, parent, comm)
    237             plot = self.object
    238         else:
--> 239             plot = self._render(doc, comm, root)
    240 
    241         plot.pane = self

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/panel/pane/holoviews.py in _render(self, doc, comm, root)
    310                 kwargs['comm'] = comm
    311 
--> 312         return renderer.get_plot(self.object, **kwargs)
    313 
    314     def _cleanup(self, root):

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/plotting/renderer.py in get_plot(self_or_cls, obj, doc, renderer, comm, **kwargs)
    218 
    219         # Initialize DynamicMaps with first data item
--> 220         initialize_dynamic(obj)
    221 
    222         if not renderer:

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/plotting/util.py in initialize_dynamic(obj)
    252             continue
    253         if not len(dmap):
--> 254             dmap[dmap._initial_key()]
    255 
    256 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/spaces.py in __getitem__(self, key)
   1342         # Not a cross product and nothing cached so compute element.
   1343         if cache is not None: return cache
-> 1344         val = self._execute_callback(*tuple_key)
   1345         if data_slice:
   1346             val = self._dataslice(val, data_slice)

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/spaces.py in _execute_callback(self, *args)
   1109 
   1110         with dynamicmap_memoization(self.callback, self.streams):
-> 1111             retval = self.callback(*args, **kwargs)
   1112         return self._style(retval)
   1113 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/spaces.py in __call__(self, *args, **kwargs)
    675         kwarg_hash = kwargs.pop('_memoization_hash_', ())
    676         (self.args, self.kwargs) = (args, kwargs)
--> 677         if not args and not kwargs and not any(kwarg_hash): return self.callable()
    678         inputs = [i for i in self.inputs if isinstance(i, DynamicMap)]
    679         streams = []

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/util/__init__.py in dynamic_operation(*key, **kwargs)
   1041 
   1042         def dynamic_operation(*key, **kwargs):
-> 1043             key, obj = resolve(key, kwargs)
   1044             return apply(obj, *key, **kwargs)
   1045 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/util/__init__.py in resolve(key, kwargs)
   1030             elif isinstance(map_obj, DynamicMap) and map_obj._posarg_keys and not key:
   1031                 key = tuple(kwargs[k] for k in map_obj._posarg_keys)
-> 1032             return key, map_obj[key]
   1033 
   1034         def apply(element, *key, **kwargs):

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/spaces.py in __getitem__(self, key)
   1342         # Not a cross product and nothing cached so compute element.
   1343         if cache is not None: return cache
-> 1344         val = self._execute_callback(*tuple_key)
   1345         if data_slice:
   1346             val = self._dataslice(val, data_slice)

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/spaces.py in _execute_callback(self, *args)
   1109 
   1110         with dynamicmap_memoization(self.callback, self.streams):
-> 1111             retval = self.callback(*args, **kwargs)
   1112         return self._style(retval)
   1113 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/spaces.py in __call__(self, *args, **kwargs)
    706 
    707         try:
--> 708             ret = self.callable(*args, **kwargs)
    709         except KeyError:
    710             # KeyError is caught separately because it is used to signal

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/util/__init__.py in dynamic_operation(*key, **kwargs)
   1042         def dynamic_operation(*key, **kwargs):
   1043             key, obj = resolve(key, kwargs)
-> 1044             return apply(obj, *key, **kwargs)
   1045 
   1046         operation = self.p.operation

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/util/__init__.py in apply(element, *key, **kwargs)
   1034         def apply(element, *key, **kwargs):
   1035             kwargs = dict(util.resolve_dependent_kwargs(self.p.kwargs), **kwargs)
-> 1036             processed = self._process(element, key, kwargs)
   1037             if (self.p.link_dataset and isinstance(element, Dataset) and
   1038                 isinstance(processed, Dataset) and processed._dataset is None):

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/util/__init__.py in _process(self, element, key, kwargs)
   1016         elif isinstance(self.p.operation, Operation):
   1017             kwargs = {k: v for k, v in kwargs.items() if k in self.p.operation.param}
-> 1018             return self.p.operation.process_element(element, key, **kwargs)
   1019         else:
   1020             return self.p.operation(element, **kwargs)

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/operation.py in process_element(self, element, key, **params)
    192             self.p = param.ParamOverrides(self, params,
    193                                           allow_extra_keywords=self._allow_extra_keywords)
--> 194         return self._apply(element, key)
    195 
    196 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/operation.py in _apply(self, element, key)
    139             if not in_method:
    140                 element._in_method = True
--> 141         ret = self._process(element, key)
    142         if hasattr(element, '_in_method') and not in_method:
    143             element._in_method = in_method

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/operation/datashader.py in _process(self, element, key)
   1533 
   1534     def _process(self, element, key=None):
-> 1535         agg = rasterize._process(self, element, key)
   1536         shaded = shade._process(self, agg, key)
   1537         return shaded

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/operation/datashader.py in _process(self, element, key)
   1512                                        if k in transform.param})
   1513             op._precomputed = self._precomputed
-> 1514             element = element.map(op, predicate)
   1515             self._precomputed = op._precomputed
   1516 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/dimension.py in map(self, map_fn, specs, clone)
    705                 if new_val is not None:
    706                     deep_mapped[k] = new_val
--> 707             if applies: deep_mapped = map_fn(deep_mapped)
    708             return deep_mapped
    709         else:

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/operation.py in __call__(self, element, **kwargs)
    212             elif ((self._per_element and isinstance(element, Element)) or
    213                   (not self._per_element and isinstance(element, ViewableElement))):
--> 214                 return self._apply(element)
    215         elif 'streams' not in kwargs:
    216             kwargs['streams'] = self.p.streams

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/operation.py in _apply(self, element, key)
    139             if not in_method:
    140                 element._in_method = True
--> 141         ret = self._process(element, key)
    142         if hasattr(element, '_in_method') and not in_method:
    143             element._in_method = in_method

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/operation/datashader.py in _process(self, element, key)
    447                 dynamic=False, **{p: v for p, v in self.p.items()
    448                                   if p not in ('name', 'dynamic')})
--> 449             return overlay_aggregate(element, **params)
    450 
    451         if element._plot_id in self._precomputed:

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/param/parameterized.py in __new__(class_, *args, **params)
   3629         inst = class_.instance()
   3630         inst.param._set_name(class_.__name__)
-> 3631         return inst.__call__(*args,**params)
   3632 
   3633     def __call__(self,*args,**kw):

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/operation.py in __call__(self, element, **kwargs)
    212             elif ((self._per_element and isinstance(element, Element)) or
    213                   (not self._per_element and isinstance(element, ViewableElement))):
--> 214                 return self._apply(element)
    215         elif 'streams' not in kwargs:
    216             kwargs['streams'] = self.p.streams

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/operation.py in _apply(self, element, key)
    139             if not in_method:
    140                 element._in_method = True
--> 141         ret = self._process(element, key)
    142         if hasattr(element, '_in_method') and not in_method:
    143             element._in_method = in_method

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/operation/datashader.py in _process(self, element, key)
    537             x, y = dims
    538 
--> 539         info = self._get_sampling(element, x, y, ndims)
    540         (x_range, y_range), (xs, ys), (width, height), (xtype, ytype) = info
    541         ((x0, x1), (y0, y1)), _ = self._dt_transform(x_range, y_range, xs, ys, xtype, ytype)

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/operation/datashader.py in _get_sampling(self, element, x, y, ndim, default)
    184         elif not np.isfinite(ystart) and not np.isfinite(yend):
    185             ystart, yend = 0, 0
--> 186             if y and element.get_dimension_type(y[0]) in datetime_types:
    187                 ytype = 'datetime'
    188 

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/dimension.py in get_dimension_type(self, dim)
   1031         if dim_obj and dim_obj.type is not None:
   1032             return dim_obj.type
-> 1033         dim_vals = [type(v) for v in self.dimension_values(dim)]
   1034         if len(set(dim_vals)) == 1:
   1035             return dim_vals[0]

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/ndmapping.py in dimension_values(self, dimension, expanded, flat)
    392             NumPy array of values along the requested dimension
    393         """
--> 394         dimension = self.get_dimension(dimension, strict=True)
    395         if dimension in self.kdims:
    396             return np.array([k[self.get_dimension_index(dimension)] for k in self.data.keys()])

~/code/asic/pyttoresque/env/lib/python3.10/site-packages/holoviews/core/dimension.py in get_dimension(self, dimension, default, strict)
    975             dims = [d for d in all_dims if dimension == d]
    976             if strict and not dims:
--> 977                 raise KeyError("%r not found." % dimension)
    978             elif dims:
    979                 return dims[0]

KeyError: "Dimension('foo') not found."

:DynamicMap   []

Screenshots or screencasts of the bug in action

image

pepijndevos, Apr 04 '22 15:04

@jlstevens, I'm not sure how to handle this at the hv level. Datashading an overlay is handled already, but here it seems like we need similar machinery for when the data is a stream?

jbednar, Apr 04 '22 21:04

Upon closer inspection it seems unrelated to streams

import holoviews as hv
import datashader as ds
import pandas as pd
import numpy as np
from holoviews.streams import Buffer, Stream, param
from holoviews.operation.datashader import datashade, shade, dynspread, spread, rasterize

hv.extension('plotly')

n=100
m=1000
i = 0
df = pd.DataFrame({'foo': np.random.rand(m), 'bar': np.random.rand(m)+0.8, 'baz': np.random.rand(m)-0.8}, index=np.arange(m)+m*i)


traces = {k: hv.Curve(df, 'index', k) for k in df.columns}
curves = hv.NdOverlay(traces, kdims='k')

spread(datashade(curves, aggregator=ds.count_cat('k')))

pepijndevos, Apr 06 '22 10:04

The problem is probably related to inconsistent vdims among the curves in the NdOverlay. If you print(curves), you obtain :NdOverlay [k] :Curve [index] (foo). If you print(curves['bar']), you get the expected :Curve [index] (bar), though.

To build on your example, you can instead define traces as traces = {k: hv.Curve(df, 'index', k).redim(**{k:'value'}) for k in df.columns}, so that print(curves) now gives :NdOverlay [k] :Curve [index] (value). You can then datashade curves.

Another way to think about it is to use an xarray.Dataset, a data structure that seems more appropriate in this case. It may look complicated if you are not familiar with xarray, but the mental model is really worth it. Besides, you can then stick to hvplot and activate datashading with an option.

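import xarray as xr
import hvplot.xarray  # registers the .hvplot accessor on xarray objects

# Reshape the wide frame to long form: one 'value' column indexed by (index, k)
df_multi_index = pd.DataFrame(df.stack(), columns=['value'])
df_multi_index.index.names = ['index', 'k']
# Named xds rather than ds so it does not shadow the datashader import (ds) above
xds = xr.Dataset.from_dataframe(df_multi_index)
curves = xds.hvplot('index', 'value').overlay()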

marcbernot, Apr 09 '22 09:04

So is this "working as expected" then? Because without datashader it does plot both traces, though admittedly, picking a column name as the axis label.

The redim solution seems nicer than my tuple hack regardless. Not sure about the xarray solution because I do need streaming in my app.

pepijndevos, Apr 09 '22 14:04

I would not say it completely works as expected, since your experience was suboptimal and confusing. In your code, you set both the vdims of the Curve elements and the values of the kdim of the NdOverlay to the column names of the dataframe. This can be inconsistent if, for example, the columns of your dataframe represent data with different units; but if they are temperature1, temperature2 and temperature3, then the right vdim would be temperature and the kdim would be 'probe' with values 1, 2 and 3.

One strength of HoloViews is that it infers as much as it can from just the relevant annotation of your data. Maybe it should issue a warning when vdims clash, as in your example. Maybe it could even infer that what you want is a new common vdim, betting that you know what you are doing with this NdOverlay that reallocates and empties the dimensionality of your data.

marcbernot, Apr 11 '22 17:04

One thing I tried is passing the vdim as a tuple, ("foo", "magnitude"). This allows you to specify the key and the label separately, but apparently it still uses the key as the domain rather than the label.

pepijndevos, Apr 11 '22 17:04