plotly.js icon indicating copy to clipboard operation
plotly.js copied to clipboard

Accept objects for encoded typedarrays in data_array valType

Open archmoj opened this issue 5 years ago • 15 comments
trafficstars

Architecture Details

The approach is to perform the typed array decoding in coerce (before the supplyDefaults logic in the layout and traces). This has the advantage that none of the supply defaults or calc logic needs to be aware of the typed array specification objects.

For performance, the ArrayBuffer instances resulting from the decoding process are cached by trace uid and property name. To keep cache bounded, only one ArrayBuffer is cached per trace per property. Playing with the isosurface example in https://github.com/plotly/plotly.js/pull/5230, building the typed array with this caching approach is about 10x faster than when performing the base64 decoding.

Implementation Details

This PR handles decoding N-dimensional arrays into a nested collection of Arrays. The inner most layer contains TypedArrays that are backed by the original decoded ArrayBuffer, so no copying is performed. So far, all the 2D traces I've played with have worked fine with this nesting arrangement.

More testing is needed, but hopefully nothing in supplyDefaults or calc or rendering will need to change in any of the traces (apart from potential bug fixes for TypedArray handling).

Latest demo

codepen

Previous attempt

The initial commit https://github.com/plotly/plotly.js/pull/5230/commits/dc9f4e59778f23721437b943688fcb08f83b6b42 was inspired by @jonmmease's PR #2911.

@plotly/plotly_js

TODO:

  • [ ] add encoding attribute, currently should be set to "base64". Brought to mind by question of alternative encodings in https://github.com/plotly/plotly.py/pull/2943#issuecomment-735790062. encoding of "arraybuffer" when the bvals already holds an array buffer.
  • [ ] investigate what we should do if a TypedArray is passed to data types other than data_array and arrayOk. e.g. info_array. Maybe we could just convert these to regular arrays in supply defaults. (https://github.com/plotly/plotly.py/pull/2943#issuecomment-735908686)

archmoj avatar Oct 22 '20 20:10 archmoj

@jonmmease are you interested to review & possibly work on this PR?

archmoj avatar Nov 02 '20 14:11 archmoj

I'd love for this functionality to land, in particular to communicate images and volumetric data more efficiently. In addition to being more memory efficient, it would also improve the speed of the JSON encoding, as we found during our work on improving the performance of the JSON encoder in #2880.

I'm happy to get my hands dirty to help move this effort forwards. E.g. by implementing the encoding step for the Python JSON encoder. Though that should come after this. In the mean time, I could perhaps help implement support for other traces?

almarklein avatar Nov 18 '20 10:11 almarklein

I just did some benchmarking on the encoding of ndarray data in Python. Perhaps this helps create some enthusiasm for this feature :)

I used a test array of 1000x1000 containing random data. I compared encoding the data in the form proposed in this PR against letting the PlotlyJSONEncoder converting it to lists. The timing includes the time of the base64 encoding.

  • The size is 28% (more than 3x smaller).
  • The speed is about 10x faster (from ~1 sec to ~0.1s for this data size).

(this is measured with the improved PlotlyJSONEncoder of #2880, before that, the speed difference was even a factor 25).

almarklein avatar Nov 18 '20 10:11 almarklein

@jonmmease are you interested to review & possibly work on this PR?

Yeah, I'm interested in picking this up again. Probably early December.

@almarklein Thanks for chiming in and for trying out those mini benchmarks!

One not yet settled question, as far as I understand, is what multi-dimensional array buffers are going to be de-serialized into in JavaScript (since TypedArrays are 1D only). @archmoj, I believe we already have a dependnecy on ndarray(https://github.com/scijs/ndarray) through the WebGL stuff. Do you think we could make plotly.js accept these as input for 2D/3D arrays (heatmap, volume, etc.)?

jonmmease avatar Nov 18 '20 10:11 jonmmease

@jonmmease are you interested to review & possibly work on this PR?

Yeah, I'm interested in picking this up again. Probably early December.

@almarklein Thanks for chiming in and for trying out those mini benchmarks!

One not yet settled question, as far as I understand, is what multi-dimensional array buffers are going to be de-serialized into in JavaScript (since TypedArrays are 1D only). @archmoj, I believe we already have a dependnecy on ndarray(https://github.com/scijs/ndarray) through the WebGL stuff. Do you think we could make plotly.js accept these as input for 2D/3D arrays (heatmap, volume, etc.)?

Right now one could pass 2D/3D arrays to volume, surface, etc. using 'shape' as mentioned in the modified attribute description.

archmoj avatar Nov 18 '20 12:11 archmoj

Unlike #2911, the decoding process is done at calc step instead of supplyDefaults: 636e644

@archmoj Could you explain your motivation here? If I remember correctly, it seemed easier to me to do this in the top-level supplyDefaults so that all of the individual traces wouldn't have to be updated to deal with the base64 object.

Thanks again for resurrecting this!

jonmmease avatar Nov 28 '20 14:11 jonmmease

Unlike #2911, the decoding process is done at calc step instead of supplyDefaults: 636e644

@archmoj Could you explain your motivation here? If I remember correctly, it seemed easier to me to do this in the top-level supplyDefaults so that all of the individual traces wouldn't have to be updated to deal with the base64 object.

Thanks again for resurrecting this!

supplyDefault is called during interactions. So basically we don't want to run slow processes on arrays in there.

archmoj avatar Nov 28 '20 15:11 archmoj

One issue I'm running into with having decoding happen in calc. This does prevent the decoding from happening during interactions, but since supplyDefaults does run during interactions, the typed arrays get overridden by the buffer specification in fullData.

You can see the effect of this in your demo. When you click to change the modebar tool the isosurface disapears. This is because fullData ends up with the buffer specification object instead of the typed array.

Not sure the best way forward here. Is there some way to skip supplyDefaults on an attribute that isn't going to have calc run on it?

jonmmease avatar Nov 28 '20 18:11 jonmmease

@jonmmease please note that I merged master into this branch in https://github.com/plotly/plotly.js/pull/5230/commits/7a37d6266d938396f07e7eb2849ee39658a20deb so that we have latest updates in our tests before expanding them for supporting typed arrays.

archmoj avatar Nov 29 '20 20:11 archmoj

@jonmmease please feel free to work & push directly into this branch. I would commit on a dev branch if I could spend some time fixing something.

archmoj avatar Nov 29 '20 20:11 archmoj

@jonmmease please feel free to work & push directly into this branch.

Ok, thanks!

jonmmease avatar Nov 29 '20 20:11 jonmmease

Plotly.js on IE9 is not working and IE9 is unsupported. @jonmmease please pick https://github.com/plotly/plotly.js/commit/aee18f9b84da3f9403ffcd1759e18d85a1d5c02c Or let me know if you want me to push into this branch. FYI - That commit would help all the tests to pass.

archmoj avatar Nov 29 '20 21:11 archmoj

TODO:

  • [ ] add encoding attribute, currently should be set to "base64". Brought to mind by question of alternative encodings in plotly/plotly.py#2943 (comment). encoding of "arraybuffer" when the bvals already holds an array buffer.

In line with @alexcjohnson's https://github.com/plotly/plotly.py/pull/2943#issuecomment-736797612, I also don't think we should add encoding attribute other than b64. Let's drop that from this PR.

archmoj avatar Dec 04 '20 01:12 archmoj

@archmoj @jonmmease this is looking good! I made some comments, but the overall approach I think is sound. I see arrayOk mentioned above, we definitely want to support that - I suspect that will dredge up more cases where we're coercing from a sub-container. Also need to make sure this new syntax passes validation.

alexcjohnson avatar Dec 04 '20 05:12 alexcjohnson

Resolved conflicts and added to v2.3.0 milestone.

archmoj avatar Jun 26 '21 16:06 archmoj

Please refer to the test file for the examples.

archmoj avatar Feb 10 '23 22:02 archmoj

The b64 image test is added by https://github.com/plotly/plotly.js/pull/5230/commits/2df73c437b9bc47155d913b09fb81a08837fbd2e and after handling typed arrays in various places, now succeeds to illustrate diffs in a number of remaining mocks. I'll continue fixing those on a dev branch then push the fixes here :hourglass_flowing_sand:

archmoj avatar Mar 06 '23 19:03 archmoj

Now the (: only :) remaining mocks not matching baselines when converted to b64 are:

    "axes_category_categoryarray",
    "gl2d_point-selection",
    "gl3d_scatter3d-blank-text",
    "point-selection",
    "polar_bar-width-base-offset"

Investigating...

archmoj avatar Mar 07 '23 02:03 archmoj

Given the speed improvements, this would be great to get in!

jacksongoode avatar Oct 03 '23 02:10 jacksongoode

Wooot! nice job folks! :)

nicolaskruchten avatar Jan 05 '24 17:01 nicolaskruchten