Albert Zeyer

938 comments of Albert Zeyer

Update: I think the draft is mostly ready now. Please check whether it suits all possible use cases (multi-GPU training, TPU training, having multiple dataset workers, etc., whatever you...

> > Yes, but you are again discussing minor implementation details.
>
> I wanted to start with the easier parts before I assume wrong things.

So far...

One small remaining question: Should this new dataset pipeline (i.e. when you set `dataset_pipeline`) use [distributed TensorFlow](https://github.com/rwth-i6/returnn/wiki/Distributed-TensorFlow) by default (i.e. have one dedicated worker for the dataset, and one worker...

> I'm not sure I understand if you guys are using the word "distributed" in the same sense it's used in TF. Distributed across GPUs within a single machine, or...

Just as a note: I started implementing this. Beginning with only the bare minimum. The first goal is to get single-GPU training to work. I will soon push some first...

> Yup, that was my point. Too much to my taste to call the suckers the same word. Keeping a parameter server on a CPU in a multi-GPU host is...

The simple case (no distributed TF, no dedicated dataset loader workers, no Horovod, i.e. no multi-GPU training) should work now, at least with the default pipeline. You can just set...
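For reference, the simple case might look like the following in a RETURNN config (a RETURNN config is a Python file). This is a hypothetical minimal sketch, assuming that setting `dataset_pipeline = True` selects the new pipeline with its defaults, while defining a `dataset_pipeline` function (as in the snippet further down) would customize it:

```python
# Hypothetical minimal RETURNN config fragment.
# Assumption: `dataset_pipeline = True` enables the new dataset pipeline
# with default behavior; a callable instead of True would define a
# custom pipeline.
use_tensorflow = True    # use the TF engine
dataset_pipeline = True  # enable the new dataset pipeline with its defaults
```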

I'm trying with such an implementation now for dynamic batch sizes via `bucket_by_sequence_length`:

```python
def dataset_pipeline(context):
  """
  :param InputContext context:
  :rtype: tensorflow.data.Dataset
  """
  import tensorflow as tf
  dataset = context.get_returnn_dataset()
  ...
```
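The snippet above is cut off, but the core idea behind `bucket_by_sequence_length` can be illustrated independently of TensorFlow. The following is a plain-Python sketch (not RETURNN's or `tf.data`'s actual implementation; `bucket_by_length` is a hypothetical helper): sequences are routed into length buckets, and each bucket is batched separately, so sequences of similar length end up in the same batch and padding overhead stays small.

```python
# Illustrative sketch of the bucketing idea behind tf.data's
# bucket_by_sequence_length (NOT the actual RETURNN/TF code).
import bisect

def bucket_by_length(sequences, bucket_boundaries, bucket_batch_sizes):
  """
  :param list sequences: sequences (e.g. lists of tokens)
  :param list[int] bucket_boundaries: ascending length boundaries, len N
  :param list[int] bucket_batch_sizes: batch size per bucket, len N + 1
  :return: list of batches, each a list of similar-length sequences
  """
  assert len(bucket_batch_sizes) == len(bucket_boundaries) + 1
  buckets = [[] for _ in bucket_batch_sizes]  # pending sequences per bucket
  batches = []
  for seq in sequences:
    # sequences with len < boundary[i] go to bucket i, longer ones further
    idx = bisect.bisect_right(bucket_boundaries, len(seq))
    buckets[idx].append(seq)
    if len(buckets[idx]) == bucket_batch_sizes[idx]:  # bucket full -> emit
      batches.append(buckets[idx])
      buckets[idx] = []
  batches.extend(b for b in buckets if b)  # emit remaining partial batches
  return batches

seqs = [[0] * n for n in [3, 10, 4, 12, 2, 11]]
batches = bucket_by_length(seqs, bucket_boundaries=[8], bucket_batch_sizes=[2, 2])
# short sequences (len < 8) batch together, long ones together:
# batches of lengths [3, 4], [10, 12], then partial [2] and [11]
```

The dynamic-batch-size aspect comes from `bucket_batch_sizes`: shorter buckets can use larger batch sizes, so the total number of padded time steps per batch stays roughly constant.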

Small status report: I think this is mostly done. This issue was really only about the API design anyway, and that seems good (no objections so far from anyone). The...

I wonder whether this can be slow and suboptimal in some cases. E.g. in `DotLayer`, you definitely would not want to do that when the axis is present in one...