raster_geometry
raster_geometry copied to clipboard
Computation time increases rapidly with increasing shape size
Thank you very much for your hard work developing this very useful library!
I was curious if you have any thoughts or suggestions for speeding up computation time for generating large shapes (specifically using nd_superellipsoid()). I have noticed that as the array shape increases, the computation time dramatically increases as well.
%time rg.nd_superellipsoid((5,5,5), 2, rel_sizes=False).astype(np.int_)
%time rg.nd_superellipsoid((50,50,50), 2, rel_sizes=False).astype(np.int_)
%time rg.nd_superellipsoid((500,500,500), 2, rel_sizes=False).astype(np.int_)
%time rg.nd_superellipsoid((1000,1000,1000), 2, rel_sizes=False).astype(np.int_)
CPU times: user 298 µs, sys: 343 µs, total: 641 µs
Wall time: 4.31 ms
CPU times: user 1.24 ms, sys: 1.43 ms, total: 2.67 ms
Wall time: 5.05 ms
CPU times: user 275 ms, sys: 375 ms, total: 650 ms
Wall time: 650 ms
CPU times: user 1.98 s, sys: 2.95 s, total: 4.93 s
Wall time: 4.94 s
I have been struggling for a while to try to come up with a solution, however I have been unsuccessful so far. Do you believe there can be a way to possibly speed up this calculation? Thank you in advance!
Hi, thanks for your interest in and appreciation for this library. Your timings indicate that the computation time increases linearly with the input size -- remember that the input size for (10, 10, 10) is 10 ** 3 == 1000 and for (100, 100, 100) is 100 ** 3 = 1000000. I would say this is what you would normally expect from efficient implementations. For the arbitrary inputs that nd_superellipsoid()
accepts, I believe it would be quite challenging to write something outperforming it. On the other hand, for specific constraints, I assume one could get substantial speed up by giving up some of the flexibility of the original implementation. For example, if your semisizes
are much smaller than the shape you intend to use, it is way more efficient to use the smallest shape containing the rasterization and just copy it in the right position of an "empty" (zeroed?) larger array. Another relatively generic area of improvement would be the implementation of 2D/3D-specialized versions of superellipsoid()
and the like, which may be written with explicit looping to be accelerated with, say, Numba, to obtain possibly substantial speed up. Another direction to explore could be to design optimizations for highly symmetric outputs where for example it is possible to compute a circle by computing just an arc with the "expensive" way and fill the rest with flipped copies. I will not have time to compute such optimization myself in the near future, but I will gladly review a patch implementing (some of) them.