
Multiprocessing from a script?

Open rLannes opened this issue 3 years ago • 3 comments

Hi, first, thank you for providing this tool to the community. I was wondering whether there is any way to use multiprocessing from inside a script. Looking at the code, factorize parses the parameters and returns the list of runs assigned to worker i. But I can't pass a list of workers, which means that from a script I have to run the analyses sequentially.
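
To illustrate what "the list of runs assigned to worker i" could look like, here is a minimal sketch that assumes a round-robin split; the helper name `jobs_for_worker` is hypothetical, and cNMF's actual assignment scheme may differ.

```python
def jobs_for_worker(jobs, worker_i, total_workers):
    """Return the jobs assigned to worker_i under an assumed round-robin split."""
    return [job for idx, job in enumerate(jobs) if idx % total_workers == worker_i]


# Stand-in for the list of factorization runs (not real cNMF data).
all_jobs = list(range(10))

for w in range(3):
    # Each worker gets a disjoint share; together the shares cover every job.
    print(w, jobs_for_worker(all_jobs, w, 3))
```

With a split like this, one call handles only one worker's share, so covering all shares from a single script means looping over every worker index in turn.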

Is there any way to achieve multiprocessing from inside a Python script (which would be more convenient)? I ended up hacking something together using the following:

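from multiprocessing import Pool


def factorize_mp_signature(args):
    # Unpack (worker index, total workers, cNMF object) and run that worker's share.
    worker_i, total_workers, cnmf = args
    cnmf.factorize(worker_i=worker_i, total_workers=total_workers)


def wrapper_factorize_mp(total_workers, cnmf):
    list_args = [(x, total_workers, cnmf) for x in range(total_workers)]

    # map() blocks until every worker has finished, so no join() is needed;
    # calling p.join() without p.close() first raises ValueError.
    with Pool(total_workers) as p:
        p.map(factorize_mp_signature, list_args)


wrapper_factorize_mp(10, cnmf_obj)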

Do you think that is appropriate? If you accept pull requests, I could make one. With a dedicated function, I believe this would be easy to integrate and would not compromise the rest of your code base. And if it were added as a method of the cNMF class, I could pass self directly to wrapper_factorize_mp, simplifying the API.

rLannes avatar Aug 01 '22 19:08 rLannes

This does seem reasonable. I will accept the pull request. Apologies for not replying sooner.

dylkot avatar Aug 23 '22 16:08 dylkot

When I use the factorize_multi_process function to try to run the factorize step with multiple processes, I get the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python\Python311\Lib\multiprocessing\spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python\Python311\Lib\multiprocessing\spawn.py", line 129, in _main
    prepare(preparation_data)
  File "C:\Program Files\Python\Python311\Lib\multiprocessing\spawn.py", line 240, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Program Files\Python\Python311\Lib\multiprocessing\spawn.py", line 291, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "...", line 7, in <module>
    cnmf_obj.prepare(counts_fn="./adata_tumor.h5ad", components=np.arange(2, 16), n_iter=200, seed=1636, num_highvar_genes=2000)
  File "C:\Program Files\Python\Python311\Lib\site-packages\cnmf\cnmf.py", line 335, in prepare
    sc.write(self.paths['tpm'], tpm)
  File "C:\Program Files\Python\Python311\Lib\site-packages\scanpy\readwrite.py", line 623, in write
    adata.write(
  File "C:\Program Files\Python\Python311\Lib\site-packages\anndata\_core\anndata.py", line 2017, in write_h5ad
    write_h5ad(
  File "C:\Program Files\Python\Python311\Lib\site-packages\anndata\_io\h5ad.py", line 77, in write_h5ad
    with h5py.File(filepath, mode) as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python\Python311\Lib\site-packages\h5py\_hl\files.py", line 562, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python\Python311\Lib\site-packages\h5py\_hl\files.py", line 241, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 122, in h5py.h5f.create
OSError: [Errno 0] Unable to synchronously create file (unable to lock file, errno = 0, error message = 'No error', Win32 GetLastError() = 33)

This seems related to #43. Is there any way to run it with multiple processes directly from the Python interpreter? I want to run it on a Windows computer.

Thanks.

JaceyMarvin99 avatar Jan 29 '24 12:01 JaceyMarvin99

Hi, the error indicates a failure while writing a file: "OSError: [Errno 0] Unable to synchronously create file".

I believe this is related to h5py; see these issues: https://github.com/h5py/h5py/issues/1101 https://github.com/h5py/h5py/issues/1220
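
One more thing worth checking: the traceback passes through spawn.py and runpy.run_path before reaching cnmf_obj.prepare(...), which suggests that each worker process is re-running the top level of the main script. On Windows, multiprocessing starts workers with the spawn method, which re-imports the main module, so an unguarded prepare() call would run again in every worker and the processes could then compete for the same output file (Win32 error 33 is a lock violation). Below is a minimal sketch of the usual guard. `StandInCNMF`, `run_worker` and `main` are hypothetical placeholders rather than cNMF code, and I have not verified that this resolves the error above.

```python
from multiprocessing import Pool


class StandInCNMF:
    """Hypothetical placeholder for a cNMF object (not the real class)."""

    def prepare(self):
        # The real prepare() writes files, so it should run exactly once.
        print("prepare called")

    def factorize(self, worker_i=0, total_workers=1):
        # Placeholder: the real factorize() runs this worker's share of the jobs.
        return worker_i


def run_worker(args):
    # Module-level function so worker processes can import it by name.
    worker_i, total_workers, obj = args
    return obj.factorize(worker_i=worker_i, total_workers=total_workers)


def main(total_workers=2):
    obj = StandInCNMF()
    obj.prepare()  # runs only in the parent process
    jobs = [(i, total_workers, obj) for i in range(total_workers)]
    with Pool(total_workers) as pool:
        return pool.map(run_worker, jobs)


# Under the spawn start method, workers re-import this file with a different
# __name__, so nothing inside this guard runs in them.
if __name__ == "__main__":
    main()
```

In a real script, the prepare(...) and factorize calls would move inside the guarded block in the same way, so that only function and class definitions remain at the top level.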

rLannes avatar Jan 29 '24 15:01 rLannes