returnn issues

PT preload_from_files ignores when no params are matching

There are various cases, e.g. whether we import for train, or for recog, or also just randomly initialize. Maybe it depends on the case whether it is ok to ignore...

albertz
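To make the concern concrete, here is a minimal sketch of a check that refuses to continue silently when a preloaded checkpoint shares no parameter names with the model. The helper `check_preload_overlap` and its `allow_no_match` flag are hypothetical and not part of RETURNN; whether such a case should be an error presumably depends on the use case, as the issue notes.

```python
from typing import Iterable, Set


def check_preload_overlap(
    model_param_names: Iterable[str],
    checkpoint_param_names: Iterable[str],
    *,
    allow_no_match: bool = False,
) -> Set[str]:
    """Hypothetical helper: return the names found in both model and checkpoint.

    Raises ValueError when nothing matches, unless allow_no_match is set.
    """
    matching = set(model_param_names) & set(checkpoint_param_names)
    if not matching and not allow_no_match:
        raise ValueError("preload: no checkpoint parameter matches any model parameter")
    return matching


# Partial overlap is fine; only the fully disjoint case is rejected.
print(sorted(check_preload_overlap(["enc.w", "dec.w"], ["enc.w", "other.w"])))
```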

`rf.set_default_device` (`torch.set_default_device`?) before model creation?

Currently `rf.set_default_device` is called *after* the model was created. Specifically, we only use `rf.set_default_device_ctx(self._device)` around the run step, and not otherwise. The model creation happens on CPU, and then we...

albertz

Plan for packed dims

3

This here is open for discussion on what we want in RETURNN. Packed tensors / packed sequences / ragged tensors / jagged tensors / flattened / flat tensors, however you...

albertz
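For readers unfamiliar with the terminology, the sketch below illustrates the general idea of a packed (flat) representation using plain Python lists: the padding frames are dropped and only the sequence lengths are kept alongside the flat data. The function names are made up for illustration and do not reflect any existing or planned RETURNN API.

```python
from typing import List, Sequence


def pack_padded(batch: Sequence[Sequence[int]], seq_lens: Sequence[int]) -> List[int]:
    """Concatenate the valid prefix of each padded row into one flat list."""
    flat: List[int] = []
    for row, n in zip(batch, seq_lens):
        flat.extend(row[:n])
    return flat


def unpack_to_padded(flat: Sequence[int], seq_lens: Sequence[int], pad_value: int = 0) -> List[List[int]]:
    """Inverse of pack_padded: rebuild rows and pad them to the longest length."""
    max_len = max(seq_lens, default=0)
    rows: List[List[int]] = []
    offset = 0
    for n in seq_lens:
        rows.append(list(flat[offset:offset + n]) + [pad_value] * (max_len - n))
        offset += n
    return rows


padded = [[1, 2, 3], [4, 0, 0], [5, 6, 0]]
lens = [3, 1, 2]
packed = pack_padded(padded, lens)  # 6 stored values instead of 9
restored = unpack_to_padded(packed, lens)
```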

pytest collecting phase is slow

Running a simple selected test takes a long time. Most of that time is spent in the collecting phase. That's very annoying for debugging. Also, I don't quite understand why...

albertz

DistributeFilesDataset _num_shards issue

4

For the latest RETURNN, when I use DistributeFilesDataset, I have this error.

```
File "/nas/models/asr/am/multilingual/16kHz/2024-11-08--jxu-best-rq-pretrain/work/i6_core/tools/git/CloneGitRepositoryJob.LD5f1wKK7LPo/output/returnn/returnn/datasets/basic.py", line 227, in Dataset._create_from_reduce
  line: ds = cls(**kwargs)
  locals:
    ds =
    cls =
    kwargs =...
```

Judyxujj

Automatically sorting dataset does not work with Torch engine forward + MetaDatasets

4

We currently do this in Torch `Engine.forward_with_callback`:

```python
...
elif dataset.supports_seq_order_sorting():
    # We can sort it. Sort it in reverse to make sure that we have enough memory right at...
```

albertz
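As a rough illustration of the sorting idea mentioned in the snippet, the sketch below computes a sequence order by length, longest first, so that the largest batches would come at the beginning. This assumes that "reverse" refers to descending sequence length; the function is hypothetical, is not the engine's actual code, and does not address the MetaDataset problem the issue is about.

```python
from typing import List, Sequence


def reverse_length_order(seq_lens: Sequence[int]) -> List[int]:
    """Return sequence indices ordered by length, longest first.

    Ties keep their original relative order because sorted() is stable.
    """
    return sorted(range(len(seq_lens)), key=lambda i: -seq_lens[i])


order = reverse_length_order([3, 7, 5, 7])
print(order)
```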

SprintCacheDataset issue with torch backend

For the latest RETURNN, when using torch backend and SprintCacheDataset, I get this error.

```
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/datapipes/datapipe.py", line 179, in IterDataPipe.__reduce_ex__
  line: return super().__reduce_ex__(*args, **kwargs)
  locals:
    super =
    __reduce_ex__ =...
```

robin-p-schmitt

SimpleHDFWriter extra seq lens not correct, not supporting custom seq lens

In `SimpleHDFWriter.insert_batch`, when using `extra` to store additional data (besides the main data stream "data"), the seq lens are currently not handled correctly. Currently the logic is: ```python...

albertz

Cleanup `returnn.tf.compat`

Now that TF1 support was dropped (#1668), we can clean up `returnn.tf.compat` (maybe also other things, but that is the only case I currently know about).

albertz

TensorFlow

Step count is not reset when loading a checkpoint and resetting the epoch

2

Current behavior in the torch engine when using a checkpoint during training via "import_model_train_epoch1" is to reset the epoch to 0 but keep the global train step count of the...

mmueller00
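The difference between the reported behavior and the one the title asks for can be sketched as follows. `TrainState` and `start_from_imported_checkpoint` are hypothetical names used only for illustration; they do not correspond to the torch engine's actual code.

```python
from typing import NamedTuple


class TrainState(NamedTuple):
    """Hypothetical container for the counters discussed in the issue."""
    epoch: int
    global_train_step: int


def start_from_imported_checkpoint(ckpt: TrainState, *, reset_step: bool) -> TrainState:
    """Reset the epoch to 0; optionally reset the global train step as well."""
    step = 0 if reset_step else ckpt.global_train_step
    return TrainState(epoch=0, global_train_step=step)


ckpt = TrainState(epoch=100, global_train_step=250000)
reported = start_from_imported_checkpoint(ckpt, reset_step=False)  # step count carried over
requested = start_from_imported_checkpoint(ckpt, reset_step=True)  # step count reset too
```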
