biothings.api
fix utils.diff.diff_collections helper function to run diffs in parallel
The utils.diff.diff_collections helper function is not used directly in the hub, but it is still a useful tool for testing two data collections for their differences.
https://github.com/biothings/biothings.api/blob/e635db03a0b5930f2436ede8ae2ee3316ac75e58/biothings/utils/diff.py#L114
The existing use_parallel option relies on IPython parallel, which probably no longer works. We would like a new way to run diffs in parallel without the IPython parallel dependency. Typically we don't need to parallelize across multiple machines; parallelizing across the CPU cores of a single machine should be good enough.
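As a starting point for discussion, here is a minimal sketch of single-machine parallelism using only the standard library's concurrent.futures.ProcessPoolExecutor. It is not the current diff_collections implementation and does not follow its signature: the names chunked, diff_batch, and parallel_diff are hypothetical, and the two "collections" are stood in for by plain dicts keyed by _id.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def diff_batch(args):
    """Compare one batch of ids. Runs inside a worker process.

    `args` is a tuple (ids, old_docs, new_docs), where the two dicts
    hold only the documents belonging to this batch (hypothetical layout).
    """
    ids, old_docs, new_docs = args
    result = {"add": [], "delete": [], "update": []}
    for _id in ids:
        if _id not in old_docs:
            result["add"].append(_id)
        elif _id not in new_docs:
            result["delete"].append(_id)
        elif old_docs[_id] != new_docs[_id]:
            result["update"].append(_id)
    return result


def parallel_diff(old, new, batch_size=1000, max_workers=None):
    """Diff two dict-based collections across local CPU cores."""
    all_ids = sorted(set(old) | set(new))
    # Ship each worker only the documents it needs for its batch.
    tasks = [
        (
            ids,
            {i: old[i] for i in ids if i in old},
            {i: new[i] for i in ids if i in new},
        )
        for ids in chunked(all_ids, batch_size)
    ]
    merged = {"add": [], "delete": [], "update": []}
    # max_workers=None lets the executor pick a default based on CPU count.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(diff_batch, tasks):
            for key in merged:
                merged[key].extend(partial[key])
    return merged


if __name__ == "__main__":
    old = {"a": 1, "b": 2, "c": 3}
    new = {"b": 2, "c": 4, "d": 5}
    print(parallel_diff(old, new, batch_size=2))
```

With real database-backed collections, the batches would more likely carry only the id lists, and each worker would open its own connection to fetch the documents, since client objects generally should not be shared across processes. That part is an assumption about how the helper would be adapted, not something the sketch covers.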