pythreejs

Performance

Open aothms opened this issue 7 years ago • 12 comments

The following code, which renders 1000 objects, takes 18 seconds to complete on my notebook. (This is just a trivial example; geometry instancing is not the solution I am looking for.) A thousand objects is not a whole lot in my domain. I imagine this takes a while because of the bidirectional nature of the bridge.

  • What is the reason this is taking so long, is there some py->js->py latency somewhere?
  • What can I do to speed this up, e.g. is there a way to defer updates in batches or something similar?
import time
import numpy
import itertools

from pythreejs import *
from IPython.display import display

t0 = time.monotonic()

N = range(10)

verts = numpy.array(list(itertools.product([0,1],[0,1],[0])))
indices = numpy.array([
    2,1,0,
    2,3,1,
], dtype=numpy.uint32)

def meshes():
    for ijk in itertools.product(N,N,N):
        xyz = numpy.array(ijk, dtype=float)

        geometry = BufferGeometry(attributes={
            'position': BufferAttribute(array=verts / 2. + xyz),
            'index': BufferAttribute(array=indices)
        })
        
        material = MeshBasicMaterial(color = "#%02x%02x%02x" % tuple(v * 30 for v in ijk))

        yield Mesh(geometry=geometry, material=material)
        
camera = PerspectiveCamera(position=[10,10,10],
                           lookAt=[0,0,0],
                           up=[0,0,1],
                           fov=50)

scene = Scene(children=list(meshes()))

renderer = Renderer(camera=camera,
                    scene = scene,
                    width = 800,
                    height = 600,
                    controls=[OrbitControls(controlling=camera)])

display(renderer)

print(time.monotonic() - t0)

aothms avatar Jan 02 '18 16:01 aothms

The issue here is unfortunately fundamental to the current design of pythreejs: each object is a widget, which has its own comm channel. For a scene with many objects, this causes significant overhead (as far as I've been able to ascertain, it is the creation of the communication bridge that is the bottleneck, as this cannot be batched).

There are two possible ways (that I know of) to circumvent this:

  1. Look at the implementation of CloneArray (class definition), a custom widget for creating an array of clones of an object. Currently this only allows you to distribute the position across the array, but this logic could be extended to include more properties.
  2. Reduce the number of widgets:
     • Only create one geometry. This can be shared between the meshes.
     • OR only create one mesh, with all the geometries you wish to use merged into one. Then, depending on the complexity of which properties you want to distribute across the objects:

I realize neither of these are ideal, but they are so far the only solutions I've been able to come up with. I'm open to any other ideas you might have.
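
The "one mesh with merged geometries" option can be sketched with numpy. This is only an illustration under assumptions, not a tested recipe from the thread: the helper names `merge_quads` and `build_mesh` are hypothetical, the per-vertex `color` attribute together with `vertexColors='VertexColors'` is assumed to be how per-object colours would be carried over, and the `'index'` entry follows the convention used in the question's code.

```python
import itertools
import numpy as np


def merge_quads(n=10):
    """Merge n**3 small quads into a single set of vertex/index/colour buffers."""
    # Same quad as in the question: 4 corners in the z=0 plane, 2 triangles.
    quad = np.array(list(itertools.product([0, 1], [0, 1], [0])), dtype=np.float32) / 2.0
    tri = np.array([2, 1, 0, 2, 3, 1], dtype=np.uint32)

    # One integer grid offset (i, j, k) per quad.
    offsets = np.array(list(itertools.product(range(n), repeat=3)), dtype=np.float32)
    count = len(offsets)

    # Translate the quad to every grid cell, then flatten to (count * 4, 3).
    positions = (quad[None, :, :] + offsets[:, None, :]).reshape(-1, 3)
    # Each quad's indices are shifted by 4 vertices per preceding quad.
    indices = (tri[None, :] + 4 * np.arange(count, dtype=np.uint32)[:, None]).reshape(-1)
    # Per-vertex RGB in [0, 1], mirroring the "v * 30" colouring in the question.
    colors = np.repeat(offsets * 30.0 / 255.0, 4, axis=0).astype(np.float32)
    return positions, indices, colors


def build_mesh(n=10):
    """Wrap the merged buffers in a single pythreejs Mesh (hypothetical helper)."""
    # Imported lazily so that merge_quads() works without pythreejs installed.
    from pythreejs import BufferAttribute, BufferGeometry, Mesh, MeshBasicMaterial

    positions, indices, colors = merge_quads(n)
    geometry = BufferGeometry(attributes={
        'position': BufferAttribute(array=positions),
        'color': BufferAttribute(array=colors),
        'index': BufferAttribute(array=indices),
    })
    # Assumption: vertex colours are enabled through the material.
    material = MeshBasicMaterial(vertexColors='VertexColors')
    return Mesh(geometry=geometry, material=material)
```

With this layout the scene needs a handful of widgets (one mesh, one geometry, three attributes, one material) instead of five per object, at the cost of the objects no longer being separate Python-side widgets.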

vidartf avatar Jan 03 '18 10:01 vidartf

Thanks for the insightful response, very much appreciated.

I'm open to any other ideas you might have.

I am new to jupyter widgets in general so I don't have anything to offer at the moment.

Look at the implementation of CloneArray (class definition)

I'll see if I can somehow wrap my head around this. The objects do need to be individually pickable for my use case. For that purpose they are already added to a Group. Perhaps that can somehow also factor in to reuse channels?

aothms avatar Jan 03 '18 13:01 aothms

The objects do need to be individually pickable for my use case.

If using a merged mesh with groups, you should be able to do a reverse lookup to find the original object based on the picked face. Again, this might not be a solution worth pursuing, but at least it might give better start-up performance.
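
A minimal sketch of such a reverse lookup, assuming the index of the picked face is available and that the faces of each original object were appended to the merged mesh in order. The function name `face_to_object` and the `faces_per_object` list are hypothetical.

```python
import numpy as np


def face_to_object(face_index, faces_per_object):
    """Return the index of the original object that owns a picked face."""
    # Cumulative face counts: ends[i] is the first face index after object i.
    ends = np.cumsum(faces_per_object)
    if not 0 <= face_index < ends[-1]:
        raise IndexError("face index outside the merged mesh")
    # side='right' so that a face index equal to a boundary maps to the next object.
    return int(np.searchsorted(ends, face_index, side='right'))
```

When every object has the same number of faces (two triangles per quad in the question's example), the lookup reduces to `face_index // faces_per_object`.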

To solve the root of this issue, ipywidgets would have to add support for batching comm open. I'm currently prototyping some code for this. Even if it doesn't solve the problem, it should help in profiling the issue better.
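
The idea of batching comm opens can be illustrated without any Jupyter machinery. The sketch below is purely conceptual: `Channel`, `open_comm` and `hold_opens` are hypothetical names, not ipywidgets APIs, and nested holds are not handled.

```python
from contextlib import contextmanager


class Channel:
    """Hypothetical stand-in for the kernel-to-frontend connection."""

    def __init__(self):
        self.sent = []      # each entry represents one message on the wire
        self._held = None   # queued opens while batching, otherwise None

    def open_comm(self, widget_id):
        if self._held is not None:
            self._held.append(widget_id)             # queue instead of sending
        else:
            self.sent.append(('open', [widget_id]))  # one message per widget

    @contextmanager
    def hold_opens(self):
        """Queue all opens inside the block and send them as one message."""
        self._held = []
        try:
            yield
        finally:
            held, self._held = self._held, None
            if held:
                self.sent.append(('open_batch', held))


unbatched = Channel()
for i in range(1000):
    unbatched.open_comm(i)

batched = Channel()
with batched.hold_opens():
    for i in range(1000):
        batched.open_comm(i)

print(len(unbatched.sent), len(batched.sent))
```

The unbatched channel sends one message per widget, while the batched one sends a single message carrying all the opens. Batching removes the per-message overhead only; any per-widget construction cost on either side remains.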

vidartf avatar Jan 04 '18 11:01 vidartf

To solve the root of this issue, ipywidgets would have to add support for batching comm open. I'm currently prototyping some code for this.

I'd love to see this!

jasongrout avatar Jan 05 '18 01:01 jasongrout

you should be able to do a reverse lookup to find the original object based on the picked face

Indeed! I didn't think of that. I am using LineSegment meshes to indicate the currently selected object by means of the line material colour. That would probably be more difficult, but shouldn't be impossible either.

I'm currently prototyping some code for this.

Awesome, let me know if there's something I can test!

aothms avatar Jan 05 '18 10:01 aothms

I've pushed a prototype for batching comm opens here: https://github.com/vidartf/ipytunnel

Further preliminary findings of profiling:

  • A networking snag that is sometimes encountered with the tunnel is fixed by installing the latest beta of pyzmq (pip install --upgrade --pre pyzmq).
  • With the tunnel, the performance is still not excellent. Creating a few thousand minimal widgets (without any pythreejs things) still takes roughly 1 ms per widget on both the Python side and the JS side. For 5000 widgets, as in your example code above, that equals 5 seconds on each side, for a total of 10 seconds. I'll profile this further to try to determine if there are any steps of the Widget constructors that can be optimized.
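
The 5000 figure follows from the example code, where each of the 1000 meshes brings its own geometry, two buffer attributes and a material. A quick check of the arithmetic, ignoring the few extra widgets for the scene, camera, renderer and controls:

```python
meshes = 10 ** 3
# Mesh + BufferGeometry + 2 BufferAttributes + MeshBasicMaterial
widgets_per_mesh = 5
ms_per_widget = 1      # rough per-side cost reported above
sides = 2              # Python and JS

total_widgets = meshes * widgets_per_mesh
seconds_per_side = total_widgets * ms_per_widget / 1000
total_seconds = seconds_per_side * sides

print(total_widgets, seconds_per_side, total_seconds)  # 5000 5.0 10.0
```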

vidartf avatar Jan 05 '18 10:01 vidartf

Further findings:

Most of the time on the Python side of widget initialization is spent in traitlets. I was able to reduce the initialization time by quite a bit (but still not by an order of magnitude), by using the latest changes in ipytunnel:

from ipytunnel import hold_comm_open, optimize
optimize()   # optimizes comm and traitlets

def meshes():
    with hold_comm_open():  # Batches comm opens
        ...

vidartf avatar Jan 08 '18 13:01 vidartf

With those changes, the Python processing time is minimized, as well as networking. I still have to profile/optimize the JS side of things (it is now the slowest part).

vidartf avatar Jan 08 '18 13:01 vidartf

See also: https://github.com/ipython/traitlets/issues/463

vidartf avatar Jan 08 '18 13:01 vidartf

Sorry for the long pause. With great enthusiasm I looked at your improvements, but I didn't quite understand how to set up ipytunnel.

First my findings for the other changes on some random model (I can elaborate if needed). Especially the caching on traitlets is significant.

| Configuration | Time (s) | Percentage of standard |
|---|---|---|
| standard | 5.127 | 100% |
| pyzmq --pre | 5.002 | 98% |
| pyzmq --pre + lru cache traitlets | 4.263 | 83% |

I tried to install ipytunnel in the following way

git clone https://github.com/vidartf/ipytunnel
cd ipytunnel
/opt/conda/bin/pip install --user -e .
jupyter nbextension enable --py --sys-prefix ipytunnel

but got the following output. The exit code was zero, but the extension probably wasn't actually enabled?

Enabling notebook extension jupyter-widget-tunnel/extension...
      - Validating: problems found:
        - require?  X jupyter-widget-tunnel/extension

aothms avatar Jan 29 '18 16:01 aothms

  • pyzmq --pre is only needed if using ipytunnel (i.e. for any message with many buffers).
  • For ipytunnel, you are doing a pip --user install but a jupyter nbextension enable --sys-prefix, so the two steps target different locations. I'm not entirely certain about things here, but I would recommend a jupyter nbextension install [...] step as well.
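
To see the two locations involved, the standard library can report both. This is only a diagnostic sketch with a hypothetical helper name, and it should be run with the same interpreter that pip and jupyter use (here presumably /opt/conda/bin/python):

```python
import site
import sys


def install_locations():
    """Return the environment prefix and the per-user site-packages directory."""
    return {
        # Base directory that --sys-prefix refers to.
        'sys_prefix': sys.prefix,
        # Directory that pip install --user places packages in.
        'user_site': site.getusersitepackages(),
    }


for name, path in install_locations().items():
    print(name, path)
```

If the two paths differ, files placed by the --user install will not necessarily be found under the prefix that the enable step inspects.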

vidartf avatar Jan 30 '18 13:01 vidartf

@vidartf, @jasongrout I came up with an alternative traitlets optimization in https://github.com/ipython/traitlets/pull/469. It's not quite as good as https://github.com/ipython/traitlets/pull/463, but it's close, and doesn't rely on caching decorators.

rmorshea avatar Feb 03 '18 21:02 rmorshea