Sebastian Hoffmann
The problem in my case is that importing conda packages can sometimes take an exorbitant amount of time. Thus, the wandb daemon process takes too long to start and the...
Hi @kptkin, the update is much appreciated! Here's the startup timing log I just recorded:
> WANDB_STARTUP_DEBUG 1675350229.154492 launch
> WANDB_STARTUP_DEBUG 1675350229.1584668 wait_ports
> WANDB_STARTUP_DEBUG 1675350264.0804012 before_network
> WANDB_STARTUP_DEBUG 1675350264.0806823 after_network
> WANDB_STARTUP_DEBUG 1675350264.080885...
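
To make the gap in the log above easier to see, here is a minimal sketch that computes the time between consecutive stages. It assumes each entry has the form `WANDB_STARTUP_DEBUG <unix timestamp> <stage>`, as in the quoted snippet; the helper `stage_durations` is a hypothetical name, not part of wandb, and only the four complete entries from the log are included.

```python
import re

# The four complete entries from the quoted log (the truncated fifth is omitted).
LOG = """
WANDB_STARTUP_DEBUG 1675350229.154492 launch
WANDB_STARTUP_DEBUG 1675350229.1584668 wait_ports
WANDB_STARTUP_DEBUG 1675350264.0804012 before_network
WANDB_STARTUP_DEBUG 1675350264.0806823 after_network
"""

# Assumed entry format: marker, float timestamp, stage name.
ENTRY = re.compile(r"WANDB_STARTUP_DEBUG\s+(\d+(?:\.\d+)?)\s+(\w+)")


def stage_durations(log):
    """Return (transition, seconds) pairs for consecutive logged stages."""
    events = [(name, float(ts)) for ts, name in ENTRY.findall(log)]
    return [
        (f"{prev} -> {cur}", t_cur - t_prev)
        for (prev, t_prev), (cur, t_cur) in zip(events, events[1:])
    ]


durations = stage_durations(LOG)
for transition, seconds in durations:
    print(f"{transition}: {seconds:.3f} s")
```

For these entries, almost all of the time (about 35 seconds) falls between `wait_ports` and `before_network`; the other transitions take only milliseconds.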
Hey @ejguan I think I would be fine with either. `get_worker_info()` however would have global state(?) and would produce issues when multiple independent datapipes are iterated in parallel (I know,...
Isn't the worker information only relevant when using the MPRS, DistributedReadingService, or both? I don't see how it is technically any different from e.g. sharding information. Also, one thing to...
@ejguan A specific use case that I would like to handle with this:
```
pipe = pipe.repeat(N_workers).sharding_round_robin_dispatch(SHARDING_PRIORITIES.MULTIPROCESSING)
```
Here, I would like to introduce a custom operation that doesn't know...
Yes, for now I can work around this. I just wrote this as an example of a real use case and its specific requirements and thought that it might be helpful...
@ejguan I might have a look at this in the next few weeks. What would be the appropriate place in torchdata to register these global signal handlers?
It would be good to have the option to profile any pipe, not only prefetcher. This could be achieved by having a `ProfilerPipe` that wraps the source pipe and measures...
To clarify what I meant, here's a rough sketch:
```python
@functional_datapipe('profile')
class ProfilerIterDataPipe(IterDataPipe):
    def __init__(self, dp, label=None, iters_per_measurement=1):
        self.dp = dp
        self.label = label
        self.measurements = []
        self.iters_per_measurement = iters_per_measurement
        ...
```
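
The idea of a wrapper that times its source can be illustrated without torchdata. The following is a standalone sketch, not the author's implementation: `ProfiledIterable` is a hypothetical plain-Python class that only borrows the parameter names from the truncated snippet above. It assumes that one "measurement" is the total time spent waiting on the source for `iters_per_measurement` consecutive items, which the original does not state.

```python
import time


class ProfiledIterable:
    """Hypothetical wrapper: times how long the wrapped iterable takes per batch of items."""

    def __init__(self, source, label=None, iters_per_measurement=1):
        if iters_per_measurement < 1:
            raise ValueError("iters_per_measurement must be >= 1")
        self.source = source
        self.label = label
        self.iters_per_measurement = iters_per_measurement
        self.measurements = []  # seconds spent in the source per completed batch

    def __iter__(self):
        it = iter(self.source)
        elapsed = 0.0
        count = 0
        while True:
            # Time only the call into the source, not the consumer's work.
            t0 = time.perf_counter()
            try:
                item = next(it)
            except StopIteration:
                return  # an incomplete trailing batch is not recorded
            elapsed += time.perf_counter() - t0
            count += 1
            if count == self.iters_per_measurement:
                self.measurements.append(elapsed)
                elapsed = 0.0
                count = 0
            yield item


profiled = ProfiledIterable(range(10), label="demo", iters_per_measurement=2)
items = list(profiled)
```

Because the timer is stopped before each `yield`, time spent by downstream consumers is excluded, so wrapping different stages of a pipeline would attribute cost to the stage that actually incurs it.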
> We should fix this, either:
>
> 1. Have no default argument
> 2. Default to multiprocessing
>
> I prefer 1 so that users are explicitly...