bokeh icon indicating copy to clipboard operation
bokeh copied to clipboard

multi_line broken when passing numpy arrays

Open poke1024 opened this issue 5 years ago • 3 comments

bokeh version 1.4.0 python 3.7.0 jupyter notebook 6.0.2 macOS 10.15.2

Description of expected behavior and the observed behavior

This should work:

import bokeh.plotting
fig = bokeh.plotting.figure()
fig.multi_line(numpy.array([[1, 3, 2], [3, 4, 6]]), numpy.array([[2, 1, 4], [4, 7, 8]]))

However, it doesn't (see error message below).

For the record, the following code, without using numpy arrays, works without errors:

import bokeh.plotting
fig = bokeh.plotting.figure()
fig.multi_line(([[1, 3, 2], [3, 4, 6]]), ([[2, 1, 4], [4, 7, 8]]))
bokeh.io.show(fig)

Complete, minimal, self-contained example code that reproduces the issue

see above.

Stack traceback and/or browser JavaScript console output

Error produced with example above:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-60-d6c8a8fa1428> in <module>
      1 import bokeh.plotting
      2 fig = bokeh.plotting.figure()
----> 3 fig.multi_line(numpy.array([[1, 3, 2], [3, 4, 6]]), numpy.array([[2, 1, 4], [4, 7, 8]]))

fakesource in multiline(self, xs, ys, **kwargs)

~/anaconda3/envs/nteract/lib/python3.7/site-packages/bokeh/plotting/helpers.py in func(self, **kwargs)
    890         glyph_ca = _pop_colors_and_alpha(glyphclass, kwargs)
    891         incompatible_literal_spec_values = []
--> 892         incompatible_literal_spec_values += _process_sequence_literals(glyphclass, kwargs, source, is_user_source)
    893         incompatible_literal_spec_values += _process_sequence_literals(glyphclass, glyph_ca, source, is_user_source)
    894         if incompatible_literal_spec_values:

~/anaconda3/envs/nteract/lib/python3.7/site-packages/bokeh/plotting/helpers.py in _process_sequence_literals(glyphclass, kwargs, source, is_user_source)
    373 
    374         if isinstance(val, np.ndarray) and val.ndim != 1:
--> 375             raise RuntimeError("Columns need to be 1D (%s is not)" % var)
    376 
    377         if is_user_source:

RuntimeError: Columns need to be 1D (xs is not)

Screenshots or screencasts of the bug in action

poke1024 avatar Jan 01 '20 11:01 poke1024

@poke1024 This is not a bug, as there is not currently any expectation that things should work with numpy arrays. The glyphs requires "lists of lists" or "lists of arrays" (because in general, the sub-lines have no expectation to be the same length). One way to adapt things would be to use a list comprehension to make a list of arrays:

In [11]: a = np.array([[1,2], [3,4]])

In [12]: lst = [x[i:] for i, x in enumerate(a)]

In [13]: lst
Out[13]: [array([1, 2]), array([4])]

It might make sense, as a convenience, to perform this adaptation automatically internally. It is a nice pure Python task that would be good for a new contributor. Would you be interested to work on the feature?

bryevdv avatar Jan 01 '20 18:01 bryevdv

Yes, why not.

Also I think it's good to support this, as the examples seem to imply some support of numpy like here: https://github.com/bokeh/bokeh/blob/c3e858b88a318ff4faa4c663b1f2c2cc53fec5ee/examples/plotting/file/lorenz.py#L32

I looked into this a bit. The whole problem is that the check in https://github.com/bokeh/bokeh/blob/c3e858b88a318ff4faa4c663b1f2c2cc53fec5ee/bokeh/plotting/helpers.py#L463 is not correct for MultiLine and MultiPolygon. For these cases, it needs to check against val.ndim != 2. With that fix, everything works fine.

There seem two obivous ways to fix this: (1) at location, by adding some hard-coded glyphclass in (MultiLine, MultiPolygon) check, or (2) by adding some new datadim property to https://github.com/bokeh/bokeh/blob/c3e858b88a318ff4faa4c663b1f2c2cc53fec5ee/bokeh/core/has_props.py#L232.

poke1024 avatar Jan 02 '20 11:01 poke1024

Also I think it's good to support this, as the examples seem to imply some support of numpy like here:

@poke1024 That example does what I suggested above: it provides a list of 1-d arrays:

In [15]: np.array_split(xprime, 7)
Out[15]:
[array([12.02081528, 11.56800288, 11.11943992, ..., 14.16587838,
        14.91413732, 15.68758183]),
...
 array([-17.12658567, -17.31135881, -17.47077918, ..., -22.26512893,
        -22.36096228, -22.37965827])]

(1) at location, by adding some hard-coded glyphclass in (MultiLine, MultiPolygon) check, or (2) by adding some new datadim property to

I think that is a reasonable place for a dim check. It slightly surprises me that that is sufficient. My concern would be that in the Bokeh server app case, things falls back to the very inefficient JSON encoding, and not the binary transport. I would expect to need to also need explicit logic to split the 2-d array into a list of 1-d arrays somewhere too. We will need to investigate this after there is an initial PR submitted to work with. I can certainly help with that.

In any case, rather than coupling that code to special case glyph class types with an instance checks, I would suggest your "datadim" approach. However, the place to put that is simply new class attributes on the appropriate glyphs. The datadim value should specify exactly which properties can accept n-d array values:

  • MulitLine, MultiPolygons and Patches all have xs and ys properties and these are the properties that could uses 2-d arrays

  • "Image" glyphs could potentially also accept 3-d arrays for the image property (their underlying type is a list of 2-d images*

So, e.g. maybe something like:

class MultiLine(...):

   _accept_dims = {'xs': 2, 'ys': 2}

The help code can check glyphcalss._accept_dims and then anything not otherwise specified is assume to accept just ndim=1 by default.

There needs to be a documented convention for how the "split" is done. is a 2d array turned into a list of columns? Or a list of rows? Evidently something is already happening, we need to make sure to document exactly what, and also make sure it is maintained under test. It's possible this could be made configurable as well

bryevdv avatar Jan 02 '20 17:01 bryevdv