
run/repro: add --jobs for transferring

Open ykacer opened this issue 3 years ago • 7 comments

It seems that we can control the number of threads used when computing hashes with:

dvc config core.checksum_jobs $n_jobs

but it does not look as easy to do for transfers (see dvc/objects/transfer.transfer()): it just uses the hardcoded value from /opt/conda/lib/python3.7/site-packages/dvc/fs/base.py.

This matters because I found that, for some obscure reason, multithreading makes hashing and transferring dramatically slower. Fortunately, forcing the number of threads to 1 fixes the issue. Thanks.

ykacer avatar Feb 03 '22 14:02 ykacer

Hey @ykacer, the number of threads used when pushing/pulling can be set with the -j/--jobs flag. Only when it is not provided does DVC fall back to the default value for the filesystem in use.

dtrifiro avatar Feb 07 '22 09:02 dtrifiro

Transferring also occurs during dvc run and dvc repro and there is no --jobs.

I have the same problem on my setup: DVC is so slow during checksums and transfers that it becomes unusable. In the config file, setting checksum_jobs to 1 did the trick for checksums, but there is no such option for transfers.

https://github.com/iterative/dvc/blob/220c633497f07c0ad9af0786cb36f738ea18178d/dvc/data/transfer.py#L173

https://github.com/iterative/dvc/blob/dd5d999644dc053625214b828e62a229e3a19be8/dvc/fs/base.py#L53

fguiotte avatar Apr 26 '22 10:04 fguiotte

> Transferring also occurs during dvc run and dvc repro and there is no --jobs.
>
> I have the same problem on my setup, DVC is so slow during checksum and transfers that it becomes unusable. In the config file, setting checksum_jobs to 1 did the trick for checksums but there no such option for transfer.
>
> https://github.com/iterative/dvc/blob/220c633497f07c0ad9af0786cb36f738ea18178d/dvc/data/transfer.py#L173
>
> https://github.com/iterative/dvc/blob/dd5d999644dc053625214b828e62a229e3a19be8/dvc/fs/base.py#L53

For now, the only workaround is to manually pull the data with --jobs first.
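
A rough sketch of that workaround as a small Python wrapper, in case it helps anyone script it. It assumes dvc is on the PATH and that the data needed by the pipeline can be fetched with dvc pull beforehand. The helper names workaround_commands and run_workaround are made up for illustration and are not part of DVC.

```python
import subprocess
from typing import List


def workaround_commands(jobs: int = 1) -> List[List[str]]:
    """Build the two commands: pull with an explicit job count, then repro.

    Hypothetical helper, not part of DVC. `dvc pull --jobs N` is the flag
    mentioned earlier in this thread; `dvc repro` has no such flag.
    """
    return [
        ["dvc", "pull", "--jobs", str(jobs)],
        ["dvc", "repro"],
    ]


def run_workaround(jobs: int = 1) -> None:
    """Run the commands in order and stop at the first failure."""
    for cmd in workaround_commands(jobs):
        subprocess.run(cmd, check=True)
```

Calling run_workaround(1) from the repository root would pull single-threaded and then reproduce the pipeline. Whether dvc repro still transfers anything afterwards depends on the pipeline, so this is only a stopgap.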

karajan1001 avatar May 03 '22 12:05 karajan1001

Closing as stale.

efiop avatar Jul 28 '22 15:07 efiop

I still have this problem, should I create a new issue?

fguiotte avatar Jul 29 '22 09:07 fguiotte

@fguiotte Ah, sorry, I see that this issue is pretty clear now. Keeping open.

Looks like we just need to add a jobs flag to those commands and pass it down. However, jobs in repro/run might be read as parallelization of stages rather than parallelization of transfers. For the former, we'll need to invent a new name in the future anyway, since --jobs is already taken.

efiop avatar Jul 29 '22 12:07 efiop

Thank you for considering this issue :slightly_smiling_face:

Maybe a config option core.transfer_jobs (similar to existing core.checksum_jobs) is easier to add and would do just fine.
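
To make the idea concrete, here is a minimal sketch of how the job count could be resolved if such an option existed. This is not DVC's actual code: core.transfer_jobs does not exist today, the function name resolve_transfer_jobs is invented, and FS_DEFAULT_JOBS is a placeholder standing in for the per-filesystem default.

```python
from typing import Mapping, Optional

# Placeholder for the filesystem's hardcoded default (assumed value).
FS_DEFAULT_JOBS = 4


def resolve_transfer_jobs(
    cli_jobs: Optional[int],
    config: Mapping[str, Mapping[str, object]],
    fs_default: int = FS_DEFAULT_JOBS,
) -> int:
    """Pick the number of transfer threads.

    Order of precedence in this sketch:
    1. an explicit -j/--jobs value from the command line,
    2. the hypothetical core.transfer_jobs config option,
    3. the filesystem default.
    """
    if cli_jobs is not None:
        return cli_jobs
    core = config.get("core", {})
    if "transfer_jobs" in core:
        # Config values may be stored as strings, so convert explicitly.
        return int(core["transfer_jobs"])
    return fs_default
```

With this order, commands without a --jobs flag, such as dvc run and dvc repro, would pick up the config value, while dvc push and dvc pull could still override it per invocation.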

fguiotte avatar Aug 01 '22 13:08 fguiotte