Moritz Gunz

Results: 133 comments by Moritz Gunz

1. I agree this should be deterministic from run to run as long as the parameters stay the same.
2. I think even `map_seq_stream` is technically not a problem, we...
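To illustrate the determinism point: as long as all randomness in the stream mapping comes from an RNG seeded only by fixed parameters and the epoch, repeated runs give identical output. A minimal, generic sketch of that idea (this is not the actual `PostprocessingDataset` callback signature; `base_seed` and the shuffle buffer are made up for illustration):

```python
import numpy as np


def map_seq_stream(seqs, *, epoch: int, base_seed: int = 1337):
    """Hypothetical stream mapping whose only randomness is an epoch-seeded RNG,
    so two runs with the same parameters yield the same output stream."""
    rng = np.random.RandomState(base_seed + epoch)
    buffer = []
    for seq in seqs:
        buffer.append(seq)
        if len(buffer) >= 10:  # small shuffle buffer, deterministic given the seed
            rng.shuffle(buffer)
            yield from buffer
            buffer.clear()
    rng.shuffle(buffer)  # flush the remainder
    yield from buffer
```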

Another data point: I have a setup where I use MultiProcDataset + DistributeFilesDataset around postprocessing datasets that postprocess data from an HDFDataset. Since DFD prefetches one subepoch of data, this ends up...
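For context, the nesting described here looks roughly like the sketch below. The option names are written from memory and may differ from the actual RETURNN dataset options; `hdf_files` and the `map_seq_stream` function are placeholders.

```python
hdf_files = ["data-1.hdf", "data-2.hdf"]  # placeholder file list


def get_sub_epoch_dataset(files):
    # One sub-epoch dataset: a postprocessing dataset wrapping an HDFDataset.
    return {
        "class": "PostprocessingDataset",
        "dataset": {"class": "HDFDataset", "files": files},
        "map_seq_stream": map_seq_stream,  # placeholder postprocessing function, e.g. as in the sketch further up
    }


train = {
    "class": "MultiProcDataset",  # worker processes on top
    "num_workers": 4,
    "buffer_size": 10,
    "dataset": {
        "class": "DistributeFilesDataset",  # distributes the HDF files over subepochs
        "files": hdf_files,
        "get_sub_epoch_dataset": get_sub_epoch_dataset,
        "partition_epoch": 20,
    },
}
```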

We just had another case at AppTek where the memory consumption of the workers became a bottleneck in combination w/ DistributeFilesDataset. I think this is due to an implementation in the...

Wrt. implementation, I'm currently thinking about the following: 1. Spawn a number of worker processes. Each worker process gets a (separate) connection to the main proc, and a (separate) Q...
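A rough sketch of that idea, assuming plain `multiprocessing` with a `Pipe` per worker for feeding work items and a bounded per-worker `Queue` for results. All names here are illustrative, not the actual MultiProcDataset code:

```python
import multiprocessing as mp


def _process_item(item):
    # Hypothetical stand-in for the actual per-item work
    # (e.g. loading and postprocessing the sequences of one file).
    return [f"{item}:seq{i}" for i in range(3)]


def _worker_main(conn, out_queue):
    # Each worker has its own connection for receiving work items and its own
    # queue for sending results back, so no item is processed twice.
    while True:
        item = conn.recv()
        if item is None:  # sentinel: no more work for this worker
            break
        for seq in _process_item(item):
            out_queue.put(seq)
    out_queue.put(None)  # end-of-stream marker for this worker


def main():
    num_workers = 2
    work_items = ["file-a", "file-b", "file-c", "file-d"]
    ctx = mp.get_context("spawn")
    conns, queues, procs = [], [], []
    for _ in range(num_workers):
        parent_conn, child_conn = ctx.Pipe()
        q = ctx.Queue(maxsize=8)  # bounded queue keeps per-worker memory in check
        p = ctx.Process(target=_worker_main, args=(child_conn, q), daemon=True)
        p.start()
        conns.append(parent_conn)
        queues.append(q)
        procs.append(p)

    # The main proc distributes the work items round-robin over the workers.
    for i, item in enumerate(work_items):
        conns[i % num_workers].send(item)
    for conn in conns:
        conn.send(None)

    # Consume results round-robin from the per-worker queues.
    done = [False] * num_workers
    idx = 0
    while not all(done):
        if not done[idx]:
            seq = queues[idx].get()
            if seq is None:
                done[idx] = True
            else:
                print(seq)
        idx = (idx + 1) % num_workers

    for p in procs:
        p.join()


if __name__ == "__main__":
    main()
```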

> Ah sorry, this is for feeding the workers, which is different in MultiProcDataset, where they all use their own sub dataset. Yes! The point here is to avoid duplicating...

> How exactly? This is basically https://github.com/rwth-i6/returnn/issues/1762. We don't really have a solution for that yet. In this PR (https://github.com/rwth-i6/returnn/pull/1765) I've used (conceptually) `rng_seed_for_worker=self.get_random_seed_for_epoch(epoch=epoch * num_workers + worker_idx)`. I think...
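Conceptually, that seed derivation maps each (epoch, worker) pair to its own "virtual epoch", so workers get distinct but reproducible seeds. A small sketch, where `get_random_seed_for_epoch` is only a stand-in for the actual RETURNN helper (the real one also mixes in things like the random seed offset):

```python
import numpy as np


def get_random_seed_for_epoch(*, epoch: int, base_seed: int = 42) -> int:
    # Stand-in for RETURNN's seed-per-epoch logic, just for illustration.
    return (base_seed + epoch) % (2**31)


def worker_rng(epoch: int, worker_idx: int, num_workers: int) -> np.random.RandomState:
    # Each (epoch, worker) pair maps to a unique "virtual epoch", so workers
    # never share a seed, and the same run reproduces the same seeds.
    seed = get_random_seed_for_epoch(epoch=epoch * num_workers + worker_idx)
    return np.random.RandomState(seed)
```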

There is a gradient checkpointing API in PT (https://pytorch.org/docs/stable/checkpoint.html). It even saves/restores the RNG state, so we could do Dropout in there. I'm not sure the RNG state there can...
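For reference, a minimal usage sketch of that PT API (`torch.utils.checkpoint.checkpoint`): with `preserve_rng_state=True` (the default), the RNG state is saved and restored, so the dropout mask used during the backward recomputation matches the one from the forward pass. The module and shapes below are made up for illustration.

```python
import torch
from torch.utils.checkpoint import checkpoint


class Block(torch.nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        self.linear = torch.nn.Linear(dim, dim)
        self.dropout = torch.nn.Dropout(p=0.1)

    def forward(self, x):
        return self.dropout(torch.relu(self.linear(x)))


block = Block()
x = torch.randn(8, 16, requires_grad=True)

# Activations inside `block` are not stored; they are recomputed in backward.
# preserve_rng_state=True makes the recomputed dropout mask match the forward pass.
y = checkpoint(block, x, use_reentrant=False, preserve_rng_state=True)
y.sum().backward()
```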

It seems to me that the API PT exposes for gradient checkpointing could be used as the RF frontend API, and for the associated TF-backend implementation as well?

> Yea that is what I referred to when we talked about it. But I need to check it more how it is done there. Specifically, I'm still not exactly...

> I would also assume, only the main thread is also mostly active, and the other threads are more idle. E.g. if this is some thread by Numpy or so,...