Ranked batch mode sampling - pre-compute pairwise distance to reduce running time

Open nishkaks opened this issue 6 years ago • 2 comments

The following code recomputes the pairwise distances for every sample selected within the batch, that is, once per iteration of the selection loop. This slows the program down significantly for larger batch sizes. https://github.com/modAL-python/modAL/blob/4029dfd4e5f68509a409d509ed706f544472bf25/modAL/batch.py#L93-L96

We can compute the pairwise distances once per batch within ranked_batch (outside the for loop), pass only the minimum distance array to select_instance, and assign it directly to https://github.com/modAL-python/modAL/blob/4029dfd4e5f68509a409d509ed706f544472bf25/modAL/batch.py#L96

There is a significant reduction in running time with this change.
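
To make the idea concrete, here is a minimal NumPy sketch of the proposed change. It is not the modAL code: the helper names (`min_dist`, `score`, `ranked_batch_naive`, `ranked_batch_cached`), the Euclidean metric, and the simplified scoring formula are all assumptions made for illustration. It also assumes that each selected instance is added to the labeled set, so the cached minimum distances are refreshed with the distance to the newly selected point only.

```python
import numpy as np


def min_dist(X_a, X_b):
    """Euclidean distance from each row of X_a to its nearest row of X_b."""
    diff = X_a[:, None, :] - X_b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def score(distances, uncertainty, n_labeled):
    """Hypothetical stand-in score mixing distance and uncertainty."""
    n_unlabeled = len(distances)
    alpha = n_unlabeled / (n_unlabeled + n_labeled)
    similarity = 1.0 / (1.0 + distances)
    return alpha * (1.0 - similarity) + (1.0 - alpha) * uncertainty


def ranked_batch_naive(X_pool, X_training, uncertainty, n_instances):
    """Recompute all pool-to-labeled distances on every iteration."""
    labeled = X_training.copy()
    mask = np.ones(len(X_pool), dtype=bool)
    picked = []
    for _ in range(n_instances):
        d = min_dist(X_pool[mask], labeled)  # full recomputation each time
        s = score(d, uncertainty[mask], len(labeled))
        best = np.flatnonzero(mask)[np.argmax(s)]
        picked.append(int(best))
        labeled = np.vstack([labeled, X_pool[best]])
        mask[best] = False
    return picked


def ranked_batch_cached(X_pool, X_training, uncertainty, n_instances):
    """Compute the minimum distances once, then update them incrementally."""
    d_all = min_dist(X_pool, X_training)  # computed once, outside the loop
    mask = np.ones(len(X_pool), dtype=bool)
    n_labeled = len(X_training)
    picked = []
    for _ in range(n_instances):
        s = score(d_all[mask], uncertainty[mask], n_labeled)
        best = np.flatnonzero(mask)[np.argmax(s)]
        picked.append(int(best))
        # Only the distance to the newly selected point can lower the minimum.
        d_new = min_dist(X_pool, X_pool[best][None, :])
        d_all = np.minimum(d_all, d_new)
        mask[best] = False
        n_labeled += 1
    return picked
```

Under these assumptions both functions select the same instances, but the cached version computes distances to the full labeled set only once and then compares against a single new point per iteration.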

@cosmic-cortex - can I contribute this code change to this repo?

nishkaks • Mar 19 '19 13:03

Yes, your contribution is very much welcome! This document contains a few brief contribution guidelines. Let me know if you have any questions, I am happy to help!

cosmic-cortex • Mar 19 '19 15:03

Thanks. Will submit a PR over the weekend.

nishkaks • Mar 20 '19 09:03