make_bag is not thread-safe
Calling make_bag from several threads at the same time fails:
import bagit
from multiprocessing.pool import ThreadPool
ThreadPool().map(bagit.make_bag, ('dirA', 'dirB'))
This fails with a FileNotFoundError because make_bag calls os.chdir. The current working directory belongs to the whole process, not to a single thread, so the two threads change directories out from under each other while bagging.
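To illustrate the underlying problem without bagit, here is a minimal sketch showing that a chdir made in one thread is visible in another. The helper name cwd_seen_after_worker_chdir is made up for this example.

```python
import os
import tempfile
import threading


def cwd_seen_after_worker_chdir(target_dir):
    """Call os.chdir in a worker thread, then return the cwd seen by the calling thread."""
    original = os.getcwd()
    try:
        worker = threading.Thread(target=os.chdir, args=(target_dir,))
        worker.start()
        worker.join()
        # The calling thread never called chdir itself, yet its cwd has moved.
        return os.getcwd()
    finally:
        # Restore the original directory so the demo has no lasting side effect.
        os.chdir(original)


with tempfile.TemporaryDirectory() as tmp:
    seen = cwd_seen_after_worker_chdir(tmp)
    # realpath guards against symlinked temp locations (e.g. /tmp on macOS).
    shared = os.path.realpath(seen) == os.path.realpath(tmp)

print(shared)
```

If shared comes out true, the main thread's working directory was changed by the worker, which is the same interference two concurrent make_bag calls run into.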
I see there's already a note in the code about dropping chdir: "# FIXME: if we calculate full paths we won't need to deal with changing directories." I just wanted to add that, in particular, the current code prevents multithreaded use.
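As a rough sketch of what the "full paths" approach could look like, the function below builds manifest-style lines by joining every path onto an absolute bag directory, so it never needs to change directories. This is not bagit's implementation: manifest_lines, its parameters, and the assumption that payload files live under a data/ subdirectory are all illustrative.

```python
import hashlib
import os


def manifest_lines(bag_dir, algorithm="sha256"):
    """Return 'checksum  relative/path' lines for files under bag_dir/data.

    Hypothetical helper: every file is opened by its full path, so the
    process-wide working directory is never touched.
    """
    # Resolve the bag directory once, up front.
    bag_dir = os.path.abspath(bag_dir)
    data_dir = os.path.join(bag_dir, "data")
    lines = []
    for root, dirs, files in os.walk(data_dir):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            full_path = os.path.join(root, name)
            digest = hashlib.new(algorithm)
            with open(full_path, "rb") as fh:
                # Hash in chunks so large payload files are not read into memory at once.
                for chunk in iter(lambda: fh.read(65536), b""):
                    digest.update(chunk)
            # Store paths relative to the bag root, with forward slashes.
            rel = os.path.relpath(full_path, bag_dir).replace(os.sep, "/")
            lines.append(f"{digest.hexdigest()}  {rel}")
    return lines
```

Because nothing here depends on the current working directory after the initial abspath call, several threads could run it on different directories at once.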
(Using a process pool instead of a thread pool would work around this issue, but doesn't help in my particular case because my worker threads need to share memory.)
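Another stopgap that keeps everything in one process is to serialize the calls behind a lock. The sketch below is an assumption-laden illustration, not something bagit provides: serialized, map_serialized, and the stand-in list_with_chdir are names invented here. It only helps if nothing else in the process relies on the working directory while a call is running, and the bagging itself no longer runs in parallel.

```python
import functools
import os
import threading
from multiprocessing.pool import ThreadPool


def serialized(func):
    """Wrap func so that only one thread at a time can be inside it."""
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)

    return wrapper


def list_with_chdir(directory):
    """Hypothetical stand-in for a chdir-based function; not bagit's code."""
    previous = os.getcwd()
    os.chdir(directory)
    try:
        return sorted(os.listdir("."))
    finally:
        os.chdir(previous)


def map_serialized(func, items, workers=2):
    """Run func over items on a thread pool, allowing one call at a time."""
    safe = serialized(func)
    with ThreadPool(workers) as pool:
        return pool.map(safe, items)


# Intended use with bagit (not executed here):
# bags = map_serialized(bagit.make_bag, ("dirA", "dirB"))
```

The worker threads still share memory, and any work they do outside the wrapped call can still overlap.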
Indeed. I was also looking at a more comprehensive fix, one that would also let us start supporting non-POSIX interfaces such as S3, but I haven't worked on that branch in a while: https://github.com/acdha/bagit-python/tree/flexible-fileio