bhack


I continue to see too many spawned processes on other DDP models/code as well. With `torchrun` and a recent pytorch nightly I see `361` processes with just 98 dataloaders.

I am testing again with new nightlies and the behaviour is still the same. @malfet How can we make progress on this?

/cc @andrewkho @gokulavasan Can you help investigate this behavior change?

I would expect no more `pytorch` processes than NGPUs × dataloader `num_workers`.
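A minimal sketch of that sanity check, assuming `psutil` is available and that PyTorch processes show up as `pt_main_thread` (as in the `pstree` output further below); the expected count here treats each GPU rank as one main process plus `num_workers` dataloader workers:

```python
# Hypothetical sanity check: count PyTorch processes and compare against
# n_gpus * (1 + num_workers) -- one main process per GPU rank plus one
# dataloader worker process per `num_workers`.
import psutil

n_gpus, num_workers = 8, 5
expected = n_gpus * (1 + num_workers)  # 48 for the job below

actual = sum(
    1 for p in psutil.process_iter(["name"])
    if p.info["name"] == "pt_main_thread"
)
print(f"expected at most {expected} processes, found {actual}")
```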

On older pytorch builds the number of pytorch processes is correlated with the dataloader `num_workers`.

e.g. `pstree` output with a pytorch nightly, same job, 8 GPUs with dataloader `num_workers=5` (DDP):

```
pt_main_thread-+-pt_main_thread-+-32*[pt_main_thread---{pt_main_thread}]
               |                |-5*[pt_main_thread---67*[{pt_main_thread}]]
               |                `-128*[{pt_main_thread}]
               |-7*[pt_main_thread-+-32*[pt_main_thread---{pt_main_thread}]]
               |                   |-5*[pt_main_thread---67*[{pt_main_thread}]]
               |                   `-126*[{pt_main_thread}]
               |-python
               `-64*[{pt_main_thread}]
```
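For context, a minimal sketch of this kind of job (the model, dataset, and the filename `repro.py` are placeholders, not the actual training code), launched with `torchrun --nproc_per_node=8 repro.py`:

```python
# Hypothetical repro script: one DDP rank per GPU, each rank with a
# DataLoader using 5 worker processes, so 8 * 5 = 40 worker processes
# are expected on top of the 8 rank processes.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset

def main():
    dist.init_process_group("nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(16, 1).to(rank), device_ids=[rank])
    data = TensorDataset(torch.randn(1024, 16), torch.randn(1024, 1))
    loader = DataLoader(data, batch_size=32, num_workers=5)

    for x, y in loader:
        loss = torch.nn.functional.mse_loss(model(x.to(rank)), y.to(rank))
        loss.backward()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```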

@andrewkho Since you want to keep the discussion here: is the issue related to the worker processes or to the number of threads (see the other mentioned PR https://github.com/pytorch/pytorch/pull/126199)?...

As you can see, it was already reproduced by @tringwald as well. I don't know whether the regression is in the dataloader or not, but we have a lot of threads. How we...
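To separate the two symptoms, a sketch (again assuming `psutil` and the `pt_main_thread` process name from the `pstree` output above) that reports per-process thread counts; capping intra-op threads with `torch.set_num_threads(1)` or `OMP_NUM_THREADS=1` is a common way to test whether it is the thread count, rather than the process count, that explodes:

```python
# Hypothetical triage helper: list each PyTorch process with its thread
# count, so "too many worker processes" can be told apart from
# "too many threads per process".
import psutil

for p in psutil.process_iter(["name", "num_threads"]):
    if p.info["name"] == "pt_main_thread":
        print(f"pid={p.pid} threads={p.info['num_threads']}")
```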

The environment where you can reproduce this is described at https://github.com/pytorch/pytorch/issues/118865#issuecomment-1924745100

OK, I will take it from the nightly, but I need to wait for another training session to really test it.