cuspatial
Decouple `interpolate` functions from trajectory
Description
This PR renames the `t` argument of `CubicSpline` to indicate that it can be used as a general cubic-spline interpolation function.
Other minor documentation refreshes are also included.
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
As discussed in the weekly meeting, we maintain both the trajectory module and the interpolate module. The interpolate module is refactored into generic interpolation functions. Specifically, the argument `t` is renamed to `x` to represent a general independent variable, as sketched below.
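For illustration, a minimal sketch of the renamed interface, based on the existing `CubicSpline` docstring (the data values are made up; treat the call pattern as an assumption rather than the final API):

```python
import cudf
import cuspatial

# The independent variable is now called `x` (previously `t`), mirroring
# scipy.interpolate.CubicSpline(x, y). Values are illustrative only.
x = cudf.Series([0.0, 1.0, 2.0, 3.0, 4.0]).astype("float32")
y = cudf.Series([0.0, 1.0, 0.0, -1.0, 0.0]).astype("float32")

spline = cuspatial.CubicSpline(x, y)

# Sample the fitted spline at new positions of the independent variable.
samples = spline(cudf.Series([0.5, 1.5, 2.5]).astype("float32"))
```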
While rewriting the docs, I tried running micro-benchmarks to find the performance cut-off point, so that we could raise warnings for users about "too-small" inputs. But the result is a bit confusing, as the cut-off point seems to shift with the ratio of the number of curves to the complexity of each curve. See benchmark: https://gist.github.com/isVoid/108d388173e7199758e31c1ce7f6842a
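For context, the benchmark has roughly this shape (a sketch, not the gist itself; the curve length, dtypes, and single-curve call are assumptions):

```python
import time
import numpy as np
import cudf
import cuspatial
from scipy.interpolate import CubicSpline as ScipyCubicSpline

n_points = 1_000_000  # one long curve; the gist sweeps curve count and length
x_host = np.arange(n_points, dtype=np.float32)
y_host = np.sin(x_host * 1e-3).astype(np.float32)

start = time.perf_counter()
ScipyCubicSpline(x_host, y_host)
print("scipy fit:", time.perf_counter() - start)

x_dev, y_dev = cudf.Series(x_host), cudf.Series(y_host)
start = time.perf_counter()
cuspatial.CubicSpline(x_dev, y_dev)
print("cuspatial fit:", time.perf_counter() - start)
```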
Perhaps @thomcom can help explain?
Hey @isVoid, I'm not sure what you mean by the cut-off point?
@thomcom Currently, in the CubicSpline code we intentionally avoid API parity with scipy because we want to discourage users from calling the API when the input data size is small: https://github.com/rapidsai/cuspatial/blob/19a8d3fada453f411d1b29a6d9449bc1c0e0c2fe/python/cuspatial/cuspatial/core/interpolate.py#L45-L47
This isn't helpful when a user still calls the API with a small input without being aware of the performance implications. In this refactor I argue that the proper way to design the API is to use the same argument names as scipy and raise a `UserWarning` when the input size is small, similar to how cudf's `groupby.apply` handles the many-groups case (see the sketch below). The key question is: how do we determine when the input size is too small? There must be some data size below which scipy is more performant than cuspatial, and that data size is the "cut-off" point I mentioned above.
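Something along these lines, where the threshold is a placeholder (finding the right value is exactly the open question):

```python
import warnings

# Hypothetical cut-off; the benchmark above is meant to determine the real value.
_MIN_CURVES_FOR_GPU = 100

def _warn_if_input_too_small(num_curves):
    """Warn when the batch is too small for the GPU path to pay off."""
    if num_curves < _MIN_CURVES_FOR_GPU:
        warnings.warn(
            f"CubicSpline received only {num_curves} curves; "
            "scipy.interpolate.CubicSpline may be faster for inputs this small.",
            UserWarning,
        )
```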
Ah, I see. I just ran a few quick benchmarks, and scipy is 15x faster than cuspatial on interpolations on my HP Z8 workstation. cuspatial only parallelizes each individual spline fit, so I estimate that if you've got 15 trajectories, cuspatial will be faster at interpolation. I was able to verify this with a set of 15 trajectories of length 1m. Creating the curve object still takes 15x as long (it is basically just creating 15 interpolations in parallel, each taking 15x as long as a single scipy fit), but interpolating 1m samples across 15 separate trajectories takes 8ms, while interpolating 15m samples in scipy takes 230ms.
> a set of 15 trajectories of length 1m

Do you mean 15 curves, each with 1,000,000 vertices?

> Creating the curve object still takes 15x as long (it is basically just creating 15 interpolations in parallel, each taking 15x as long as a single scipy fit)

Do you mean that when cuspatial interpolates 15 curves in parallel, the total kernel time is 15x as long as scipy's? What's the order of the time cost here, seconds? milliseconds?

> interpolating 1m samples across 15 separate trajectories takes 8ms, while interpolating 15m samples in scipy takes 230ms

Sounds like cuspatial has an advantage when sampling a complex curve (one with many vertices) but no advantage computing the curve parameters when the total number of curves is small? I think we can raise a performance warning when a small number of curves (< 100) is constructed, do you agree?

Out of scope, but could you fit the curves with scipy, pass the parameters to the device, and sample on the device? If you only have 15 curves, you only have 15x4 parameters.
> a set of 15 trajectories of length 1m
>
> Do you mean 15 curves, each with 1,000,000 vertices?

Yes.

> Do you mean that when cuspatial interpolates 15 curves in parallel, the total kernel time is 15x as long as scipy's? What's the order of the time cost here, seconds? milliseconds?

I think that when cuspatial interpolates 15 curves in parallel, the cost is similar to using scipy to compute them in serial.

> Sounds like cuspatial has an advantage when sampling a complex curve (one with many vertices) but no advantage computing the curve parameters when the total number of curves is small? I think we can raise a performance warning when a small number of curves (< 100) is constructed, do you agree?

I was thinking that < 15 curves was the appropriate threshold for the warning. I'll write a notebook to make the benchmark concrete.

> Out of scope, but could you fit the curves with scipy, pass the parameters to the device, and sample on the device? If you only have 15 curves, you only have 15x4 parameters.

This would be theoretically possible, but our curve-fitting equations are not identical to scipy's, IIRC.
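For what it's worth, a rough sketch of the "fit on host, sample on device" idea (not part of this PR; it assumes CuPy and scipy's per-segment coefficient layout, and would not reproduce cuspatial's fit exactly given the different equations mentioned above):

```python
import cupy as cp
import numpy as np
from scipy.interpolate import CubicSpline

# Fit on the host with scipy.
x = np.linspace(0.0, 10.0, 100)
y = np.sin(x)
cs = CubicSpline(x, y)

# Move the per-segment coefficients (shape (4, n_segments), highest order first)
# and the breakpoints to the device.
coeffs = cp.asarray(cs.c)
breaks = cp.asarray(cs.x)

def sample_on_device(x_new):
    # Locate the segment for each sample, then evaluate the cubic via Horner's rule.
    idx = cp.clip(cp.searchsorted(breaks, x_new) - 1, 0, coeffs.shape[1] - 1)
    dx = x_new - breaks[idx]
    return ((coeffs[0, idx] * dx + coeffs[1, idx]) * dx + coeffs[2, idx]) * dx + coeffs[3, idx]

samples = sample_on_device(cp.linspace(0.0, 10.0, 1_000_000))
```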
rerun tests
@gpucibot merge