gosling.js
Improve performance of json-type by removing deep-clones of spec/data
When inlining long arrays (>10k objects) as `values` in a spec via `type: json`, zooming and panning becomes painfully slow. We identified the key reason to be the deep-cloning (`structuredClone`) of the spec, which happens in several places. See this (draft) PR for the places where we removed the deep-cloning without breaking any tests/demos: https://github.com/datavisyn/gosling.js/pull/1
We know that the main point of using HiGlass is to avoid plotting that many points; however, the performance improvements are so drastic that refactoring towards fewer deep-cloning operations should still be considered.
It should also be noted that the `sampleLength` property has no influence on this, because the bottleneck really is the cloning of the entire `values` array, which makes Gosling slow even with `sampleLength: 100`.
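For illustration, here is a rough sketch of the kind of spec we mean (the data fields and sizes are made up for this example, not taken from the actual benchmark):

```ts
// A track with a large inlined JSON data source; `largeArray` stands in for
// the >10k objects mentioned above.
const largeArray = Array.from({ length: 50_000 }, (_, i) => ({
    chr: 'chr1',
    pos: i,
    value: Math.random()
}));

const track = {
    data: {
        type: 'json',
        values: largeArray,      // inlined directly into the spec
        chromosomeField: 'chr',
        genomicFields: ['pos'],
        sampleLength: 100        // sampling does not help: the full array is still cloned
    },
    mark: 'point',
    x: { field: 'pos', type: 'genomic' },
    y: { field: 'value', type: 'quantitative' }
};
```

Every deep clone of such a spec copies `largeArray` in full, which is what makes zooming/panning expensive.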
Slow demo (v0.9.25)
(ignore the title, it's the slow version)
Fast demo (without the deep-cloning, using the built version of https://github.com/datavisyn/gosling.js/pull/1)
Open questions:
- [ ] see the ones in https://github.com/datavisyn/gosling.js/pull/1
Thank you @puehringer for exploring this! The performance improvement is really interesting to see.
The only reason I was using the deep copy frequently was to safely keep the original (or intermediate) specs. For example, `this.specComplete` copies the original spec so that any values added to it later (e.g., missing values like `domain`) are not added to the original spec.
https://github.com/gosling-lang/gosling.js/blob/7edb6c0bc31e2245de091f342e8ffb4b07dd4427/src/core/gosling-track-model.ts#L84
https://github.com/gosling-lang/gosling.js/blob/7edb6c0bc31e2245de091f342e8ffb4b07dd4427/src/core/gosling-track-model.ts#L97
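In other words (a minimal sketch, not the actual gosling.js code), the clone keeps default-filling from leaking into the user's spec:

```ts
// Why the deep copy exists: filling in defaults on a working copy must not
// mutate the spec the user passed in.
type Spec = { tracks: { x: { field: string; type: string; domain?: { chromosome: string } } }[] };

const original: Spec = { tracks: [{ x: { field: 'pos', type: 'genomic' } }] };

const working = structuredClone(original);            // deep copy
working.tracks[0].x.domain = { chromosome: 'chr1' };  // fill in a missing value

console.log(original.tracks[0].x.domain);             // undefined: the original is untouched
```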
Similarly, whenever the Gosling API exposes the spec to users, I deep clone it before sending it to them so that, if users edit the spec, it does not affect the spec used inside the GoslingComponent.
```js
publish(eventType, {
    id: context.viewUid,
    spec: structuredClone(this.options.spec), // instead of `spec: this.options.spec`
    shape: { x, y, width, height }
});

...

const data = gosRef.current.api.getTrackInfo('id');
data.spec.tracks = []; // does not affect the spec inside the GoslingComponent
```
But I did not consider large JSON values sufficiently when implementing this, and perhaps some cloning can be removed.
Ultimately, if we want to support large JSON values, we could keep the `values` outside of the spec after compiling it (e.g., removing `values` fields from the spec and keeping them separately in the `gosling-track-model`) so that cloning the spec afterward does not decrease performance.
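As a rough sketch of that idea (hedged: names like `valueStore` and `extractValues` are made up here, and it assumes tracks have stable ids):

```ts
// Strip large inline `values` out of the compiled spec and keep them in a
// side store, so that later spec clones no longer copy the data.
type Datum = Record<string, unknown>;
interface JsonData { type?: string; values?: Datum[] }
interface TrackLike { id: string; data?: JsonData }

const valueStore = new Map<string, Datum[]>();

function extractValues(tracks: TrackLike[]): void {
    for (const track of tracks) {
        if (track.data && track.data.type === 'json' && track.data.values) {
            valueStore.set(track.id, track.data.values); // keep a reference, not a copy
            delete track.data.values;                    // the spec itself stays small
        }
    }
}

// Cloning the stripped spec is now cheap; the track model would look the data
// up by id (e.g. valueStore.get(track.id)) instead of reading spec.data.values.
```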
(Partly related to https://github.com/gosling-lang/gosling.js/issues/508)
@sehilyi I worked on a more concrete example (i.e. with inlined JSON) which can also be used in your editor. It is a simple alignment vis with inlined data for multiple species. This should help reproduce the issues: json_demo.json. I think this is a pretty good benchmark, as it should be possible to make this visualization super smooth. Let me know if we can help with anything; I think https://github.com/datavisyn/gosling.js/pull/1 is a starting point. I also get your point about passing copies around for immutability, and I agree that performance should improve drastically if the spec and the (inline JSON) data were kept separate.
And here is a quick and easy way to check what is being cloned (it just overrides JSON.stringify, but there are also structuredClone calls). The demo shows how often it is called when zooming/panning, including the copying of the (large) original data array.
```js
// Log every JSON.stringify call to see how often (and on what) it is invoked
const orig = JSON.stringify;
JSON.stringify = (...args) => {
    console.log("STRINGIFY", args);
    return orig(...args);
};
```
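A similar (untested) patch also makes the structuredClone calls visible; it assumes structuredClone is available on globalThis, as in current browsers:

```ts
// Same idea for structuredClone: log each call before delegating to the original.
const origClone = globalThis.structuredClone;
globalThis.structuredClone = ((value: unknown, options?: StructuredSerializeOptions) => {
    console.log('STRUCTURED_CLONE', value);
    return origClone(value, options);
}) as typeof structuredClone;
```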
Hi @puehringer, thanks for the comment and for looking into this in more detail. The JSON example you provided will be very helpful. I've started to set up performance testing infrastructure and will use this as my first example to optimize.