Set better defaults for `MultiProcessingReadingService`
🚀 The feature
```python
class MultiProcessingReadingService(ReadingServiceInterface):
    num_workers: int = get_number_of_cpu_cores()
    pin_memory: bool = True
    timeout: float
    worker_init_fn: Optional[Callable[[int], None]]  # Remove this?
    prefetch_factor: int = profile_optimal_prefetch_factor(model: nn.Module)
    persistent_workers: bool = True
```
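Note that `get_number_of_cpu_cores()` and `profile_optimal_prefetch_factor()` are hypothetical helpers from the proposal above, not existing torchdata APIs. A minimal sketch of how the CPU-core helper could be implemented (the name is assumed from the snippet):

```python
import os

def get_number_of_cpu_cores() -> int:
    """Hypothetical helper: number of CPU cores usable by this process.

    Prefers the scheduler affinity mask, which respects taskset/cgroup
    limits on Linux, and falls back to the total core count, then to 1.
    """
    try:
        # os.sched_getaffinity is only available on some platforms (e.g. Linux).
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return os.cpu_count() or 1
```

Using the affinity mask rather than a bare `os.cpu_count()` matters in containers, where a job may be pinned to far fewer cores than the machine physically has.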
I can add these; I'm opening this issue to discuss whether changing the defaults is a good idea.
+: Users get better out-of-the-box performance with torchdata
-: Backward compatibility issues when moving from dataloaderv1 to dataloaderv2
Motivation, pitch
There are many issues on the PyTorch forums, Stack Overflow, and blogs describing how people should configure data loaders for optimal performance. A lot of the tricks haven't changed, e.g. `pin_memory=True`, `num_workers=num_cpu_cores`, or `persistent_workers=True`, and since we're in the process of developing dataloaderv2, now may be a good time to revisit these default values:
- https://www.jpatrickpark.com/post/prefetcher/#:~:text=The%20prefetch_factor%20parameter%20only%20controls,samples%20prefetched%20across%20all%20workers.
- https://stackoverflow.com/questions/53998282/how-does-the-number-of-workers-parameter-in-pytorch-dataloader-actually-work
- https://discuss.pytorch.org/t/when-to-set-pin-memory-to-true/19723
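For reference, the tips in those links map directly onto today's `torch.utils.data.DataLoader` keyword arguments. A sketch of collecting them in one place (the helper name and the heuristic values are illustrative, not an existing API):

```python
import os

def suggested_loader_kwargs() -> dict:
    # Illustrative defaults based on the common tuning tips linked above.
    n_cores = os.cpu_count() or 1
    return {
        "num_workers": n_cores,      # one worker per CPU core
        "pin_memory": True,          # faster host-to-GPU transfers
        "persistent_workers": True,  # avoid re-forking workers each epoch
        "prefetch_factor": 2,        # batches prefetched per worker
    }

# These kwargs could then be splatted into a loader, e.g.:
# loader = torch.utils.data.DataLoader(dataset, **suggested_loader_kwargs())
```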
Alternatives
- Instead of setting reasonable defaults, we could extend `linter.py` to suggest some of these tips when we notice sources of slowdown
- Do nothing, and suggest people read the documentation when tuning performance
Additional context
No response
Just a note: `MultiProcessingReadingService` (MPRS) is a temporary reading service. We will change it to the `PrototypeMultiProcessingReadingService`. And, for `prefetch_factor`, we might provide it as a DataPipe to let users define it in their pipeline. And, we have decided to make `pin_memory` an adapt_fn. See: https://github.com/pytorch/data/pull/485
For `worker_init_fn`, I think we still need it so that users can control the state of the worker process if they have special use cases.
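As one example of such a use case, a `worker_init_fn` is commonly used to give each worker process its own deterministic RNG state. A sketch (the seeding scheme and `BASE_SEED` are illustrative):

```python
import random

BASE_SEED = 1234  # illustrative; in practice derive this from the run's seed

def worker_init_fn(worker_id: int) -> None:
    # Seed this worker's RNG so each worker draws a distinct but
    # reproducible stream (the same idea applies to numpy/torch seeds).
    random.seed(BASE_SEED + worker_id)

worker_init_fn(0)
first_draw_worker0 = random.random()
worker_init_fn(1)
first_draw_worker1 = random.random()
```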
I like the idea of providing a reasonable number of workers by default, since it makes no sense to pass `num_workers=0` to MPRS to achieve single-process iteration.