RapidFuzz
Support multiprocessing for the process module
All the algorithms in the process module should be fairly simple to run in parallel.
This is now supported for process.cdist using the workers argument.
@maxbachmann Is multiprocessing supported in extractOne?
So far multiprocessing is only supported by process.cdist.
So if I want multiprocessing in the 1:n scenario (process.extract), what would you recommend currently? Is using process.cdist with a single-item query going to be better than a custom loop over the scorer (parallelized via Python multiprocessing or some other paradigm)?
Multiprocessing in the 1:n scenario is not really implemented in process.cdist. Currently it parallelizes only the outer loop (the one over the queries), so it has basically no effect when used with a single query. So right now you should probably use Python multiprocessing.
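For reference, here is a rough sketch of what "use Python multiprocessing" could look like for a single query against many choices. It is not part of RapidFuzz: the names parallel_extract, score_chunk, and chunk_size are made up for illustration, and the result shape (choice, score, index) only mimics what process.extract returns. The choices are split into slices, each slice is scored in a worker process, and the best matches are merged at the end. The difflib fallback is only there so the sketch runs without RapidFuzz installed.

```python
import heapq
from concurrent.futures import ProcessPoolExecutor

try:
    from rapidfuzz import fuzz

    def score(a, b):
        # Normalized similarity in the range 0..100.
        return fuzz.ratio(a, b)
except ImportError:
    # Assumption: stand-in scorer so the sketch still runs without RapidFuzz.
    from difflib import SequenceMatcher

    def score(a, b):
        return 100.0 * SequenceMatcher(None, a, b).ratio()


def score_chunk(task):
    """Score one slice of the choices; this is what runs in a worker process."""
    query, offset, chunk = task
    return [(choice, score(query, choice), offset + i)
            for i, choice in enumerate(chunk)]


def parallel_extract(query, choices, limit=5, workers=1, chunk_size=10_000):
    """Hypothetical 1:n helper: returns the top `limit` (choice, score, index) tuples."""
    tasks = [(query, start, choices[start:start + chunk_size])
             for start in range(0, len(choices), chunk_size)]
    if workers <= 1:
        # Serial path, no process pool involved.
        parts = [score_chunk(task) for task in tasks]
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            parts = list(pool.map(score_chunk, tasks))
    scored = (item for part in parts for item in part)
    return heapq.nlargest(limit, scored, key=lambda item: item[1])
```

To actually use several processes, call it with e.g. workers=4 from inside an `if __name__ == "__main__":` guard. Starting processes and pickling the slices has a cost, so this is likely to pay off only for large choice lists; measuring against plain process.extract first is a good idea.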
Understood. Thanks for the clarification!