
Regression tests

Open tmbb opened this issue 9 months ago • 5 comments

I have ported the Python code from this package to Rust: https://github.com/tmbb/kde_diffusion/.

Because I don't know how to program Matlab (and because Matlab seems to be harder to run in general), I have used the Python version as the gold standard. I have used KDE-diffusion to generate a number of test cases based on the distributions analysed in the original Botev et al. (2010) paper (Table 1). I have implemented about half of the distributions.

I wonder if you'd like to add the data I've generated to this repo so that it could be used as a form of regression testing. I can think of a couple things you could do in your code to (maybe) improve performance, and having a number of densities for regression testing could be useful.

If you want, I can help write test cases for those distributions.
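To make the proposal concrete, here is a minimal sketch of how reference data for one of the Botev et al. test distributions could be generated as a fixture. The "claw" mixture density below is one of the Marron–Wand shapes used in that paper's Table 1; the fixed seed and the `claw_reference.npz` filename are illustrative choices, not anything from this repo.

```python
import numpy as np

def sample_claw(n, rng):
    """Sample from the 'claw' density (Marron & Wand, used in Botev
    et al. 2010, Table 1): an equal-parts mixture of N(0, 1) and five
    narrow normals N(k/2 - 1, (1/10)^2) for k = 0..4."""
    weights = np.array([0.5] + [0.1] * 5)   # 1/2 broad + 5 x 1/10 claws
    means = np.array([0.0] + [k / 2 - 1 for k in range(5)])
    stds = np.array([1.0] + [0.1] * 5)
    component = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[component], stds[component])

# A fixed seed makes the fixture reproducible across test runs.
rng = np.random.default_rng(seed=42)
samples = sample_claw(50_000, rng)
np.savez("claw_reference.npz", samples=samples)
```

A regression test could then load the stored samples, run the estimator, and compare the resulting density against a saved reference array.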

Here is the demo site where I show how both my implementation and your Python implementation handle the densities from Botev et al.: https://tmbb.github.io/kde_diffusion/

tmbb avatar Mar 15 '25 20:03 tmbb

Hi. That's great that there's now a Rust implementation. 🚀

As for the additional tests... I'm not sure I see the point of that. We already have 100% code coverage. Why would the shape of the input data make any difference when the same lines of code get executed?

If you want to make changes that improve performance, we should be adding a performance test. Just so we can measure the speed-up. And it could also be used to benchmark against your Rust implementation.

john-hen avatar Mar 18 '25 07:03 john-hen

> As for the additional tests... I'm not sure I see the point of that. We already have 100% code coverage. Why would the shape of the input data make any difference when the same lines of code get executed?

It's possible that numerical instability causes differences in the output for some input data, and it's possible that performance improvements introduce numerical instability of their own. But I agree with you that having so many test cases may be overkill.
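Since the concern is numerical drift rather than logic errors, such regression tests would want a tolerance-based comparison rather than exact equality. A small sketch (the arrays here are simulated, not real KDE output):

```python
import numpy as np

# Hypothetical stored reference density and a freshly computed one that
# differs only by floating-point noise (simulated with a tiny scale factor).
reference = np.exp(-np.linspace(-3, 3, 256) ** 2 / 2)
computed = reference * (1 + 1e-12)

# An exact comparison would flag harmless round-off; a tolerance-based
# check only fails on genuine numerical drift.
np.testing.assert_allclose(computed, reference, rtol=1e-8, atol=1e-12)
```

The tolerances would need tuning to the actual spread observed between implementations.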

> If you want to make changes that improve performance, we should be adding a performance test.

Sure, I'll look into it when I have the time.

tmbb avatar Mar 18 '25 10:03 tmbb

Great, I might put some thought into it as well if I find some time. I'm curious to see to what extent Rust can beat NumPy. I don't think by much, as the code is vectorized. I'd aim for a benchmark that runs about 10 seconds on modest hardware, say 4 cores.

john-hen avatar Mar 18 '25 20:03 john-hen

> Great, I might put some thought into it as well if I find some time. I'm curious to see to what extent Rust can beat NumPy.

I think Rust has a chance of beating NumPy here because your implementation runs the Brent root-finding algorithm iteratively on a function that is implemented in Python. I believe that destroys much of the performance gained from vectorization. The Rust implementation, in comparison, is written entirely in Rust, so the solver part should be faster. I can try to benchmark the Python implementation against the Rust one, though.
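The overhead being described is that an iterative solver re-enters a Python-level objective on every iteration. A toy illustration that counts those re-entries (plain bisection stands in for Brent here, and the objective is a made-up spectral sum, not KDE-diffusion's actual fixed-point function):

```python
import numpy as np

calls = 0

def objective(t):
    """Stand-in for the scalar fixed-point function that the bandwidth
    selector solves iteratively; each evaluation is a Python call that
    does some NumPy work."""
    global calls
    calls += 1
    terms = np.exp(-np.arange(1, 51) ** 2 * np.pi ** 2 * t)
    return terms.sum() - 0.5

def bisect(f, lo, hi, tol=1e-9):
    """Plain bisection (Brent converges faster, but the point stands:
    the objective is re-entered from Python on every iteration)."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if fmid * flo <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

root = bisect(objective, 1e-6, 1.0)
print(f"root found after {calls} objective evaluations")
```

Each of those evaluations pays Python call overhead that a compiled-language solver avoids entirely.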

You should note that the Botev algorithm has some very interesting performance characteristics. It can be slower with a lower number of data points, because the Brent root-finding algorithm may take longer to converge. This makes sense: the algorithm has a harder time finding the optimal bandwidth if you give it a "less well-defined" shape.

This means that you should benchmark on a wide range of input sizes. In my Rust implementation, the algorithm is slower for < 2,000 samples than for > 10,000 samples.
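A benchmark sweeping over input sizes could be sketched like this. The estimator below is a deliberate stand-in (histogram plus a transform, mimicking only the fixed-cost part of the diffusion KDE), so it will not reproduce the data-dependent root-find cost described above; the sizes and repeat count are arbitrary:

```python
import time
import numpy as np

def toy_estimate(samples, n_bins=1024):
    """Stand-in estimator: grid the data and take a spectral transform.
    The real algorithm adds a data-dependent iterative bandwidth search
    on top of this fixed cost."""
    hist, _ = np.histogram(samples, bins=n_bins, density=True)
    return np.fft.rfft(hist)

rng = np.random.default_rng(0)
for n in (500, 2_000, 10_000, 100_000):
    samples = rng.normal(size=n)
    start = time.perf_counter()
    for _ in range(50):
        toy_estimate(samples)
    elapsed = (time.perf_counter() - start) / 50
    print(f"n = {n:>7}: {elapsed * 1e6:8.1f} µs per call")
```

Plugging the real estimator into the loop would reveal whether the small-sample slowdown shows up in the Python version too.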

tmbb avatar Mar 18 '25 22:03 tmbb

I've managed to set up a quick benchmark:

Python version (operating on the data from reference1d.npz):

```
-------------------------------------------- benchmark: 1 tests --------------------------------------------
Name (time in ms)        Min     Max    Mean  StdDev  Median     IQR  Outliers       OPS  Rounds  Iterations
------------------------------------------------------------------------------------------------------------
test_performance      3.2633  6.3101  3.7464  0.2476  3.6904  0.2474   540;152  266.9221    3102           1
------------------------------------------------------------------------------------------------------------
```

Rust version (operating on the same data):

```
            Lower bound   Estimate      Upper bound
Slope       1.3936 ms     1.4211 ms
R²          0.4535447     0.4672452
Mean        1.3557 ms     1.3767 ms
Std. Dev.   78.934 µs     112.67 µs
Median      1.3236 ms     1.3410 ms
MAD         42.195 µs     64.205 µs
```

The Rust version is a bit more than twice as fast, but that's probably not very significant in practice. I'm actually surprised that the difference isn't larger.

In any case, I haven't built the Rust version to be faster; I've built it so that I can use it from Elixir.

tmbb avatar Mar 19 '25 12:03 tmbb