returnn
The RWTH extensible training framework for universal recurrent neural networks
ESPnet basically does it like this:
- Sort the whole dataset. (The dataset could maybe be directly stored in a sorted way. This would speed up the random access later.) ...
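A minimal sketch of that sort-then-bucket idea; the function name, the frame budget, and the batching heuristic here are made up for illustration and are not ESPnet's or RETURNN's actual code:

```python
from typing import List


def make_sorted_batches(seq_lens: List[int], max_frames_per_batch: int = 20000) -> List[List[int]]:
    """
    Sort all sequences by length once, then greedily group neighboring
    sequences into batches so that each batch stays under a frame budget
    and padding overhead stays small.
    """
    order = sorted(range(len(seq_lens)), key=lambda i: seq_lens[i])
    batches: List[List[int]] = []
    cur: List[int] = []
    cur_max = 0
    for idx in order:
        new_max = max(cur_max, seq_lens[idx])
        # Cost of a batch is (num seqs) * (longest seq), due to padding.
        if cur and new_max * (len(cur) + 1) > max_frames_per_batch:
            batches.append(cur)
            cur, cur_max = [], 0
            new_max = seq_lens[idx]
        cur.append(idx)
        cur_max = new_max
    if cur:
        batches.append(cur)
    return batches
```

Because the dataset is sorted once globally, batching only ever touches neighboring (similar-length) sequences, which is what would make storing the dataset in sorted order attractive for random access.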
This PR adds `dyn_dim_min_sizes` and `dyn_dim_max_sizes` to the command line options of `tools/torch_export_to_onnx.py`. By default, extern data with a time dimension size in the range [2, 25] is generated for export, ...
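The exact CLI syntax of the new options is not shown here. As an illustration of the underlying mechanism, a plain `torch.onnx.export` call that traces with a dummy input whose time dimension is drawn from such a range, while marking batch and time axes as dynamic, might look like this (the model and tensor names are placeholders):

```python
import random
import torch


class DummyModel(torch.nn.Module):  # placeholder for the real network
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(40, 10)

    def forward(self, x):  # x: [batch, time, feature]
        return self.linear(x)


model = DummyModel()
# Pick a dummy time dimension size from the configured range, e.g. [2, 25].
time_len = random.randint(2, 25)
dummy_input = torch.randn(1, time_len, 40)

torch.onnx.export(
    model,
    (dummy_input,),
    "model.onnx",
    input_names=["data"],
    output_names=["output"],
    # Mark batch and time axes as dynamic so the exported graph
    # does not hard-code the dummy sizes used during tracing.
    dynamic_axes={"data": {0: "batch", 1: "time"}, "output": {0: "batch", 1: "time"}},
)
```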
I have a `MetaDataset` which contains two `HDFDataset`s, and I want to apply a sequence list filter file. The `MetaDataset` has an option `seq_list_file`, but the docstring says >You only...
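For context, a rough sketch of the kind of config this question is about; the file names and data keys are made up, and the exact option set should be checked against the RETURNN dataset documentation:

```python
train = {
    "class": "MetaDataset",
    "datasets": {
        "features": {"class": "HDFDataset", "files": ["features.hdf"]},
        "targets": {"class": "HDFDataset", "files": ["targets.hdf"]},
    },
    # Map the combined data keys to (sub-dataset name, key in that sub-dataset).
    "data_map": {
        "data": ("features", "data"),
        "classes": ("targets", "data"),
    },
    "seq_order_control_dataset": "features",
    # The option in question: restrict the seqs to those listed in this file.
    "seq_list_file": "seq_list.txt",
}
```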
For single-GPU training, without PyTorch DataLoader multiprocessing and without MultiProcDataset, the memory usage of the dataset is maybe not too much of a problem. However, it is not uncommon...
```
RETURNN starting up, version 1.20240117.113304+git.54097989, date/time 2024-01-17-23-15-11 (UTC+0000), pid 1130069, cwd /work/asr4/zeyer/setups-data/combined/2021-05-31/work/i6_core/returnn/training/ReturnnTrainingJob.wmezXtjsvAck/work, Python /work/tools/users/zeyer/py-envs/py3.11-torch2.1/bin/python3.11
RETURNN command line options: ['/u/zeyer/setups/combined/2021-05-31/work/i6_core/returnn/training/ReturnnTrainingJob.wmezXtjsvAck/output/returnn.config']
Hostname: cn-284
Installed native_signal_handler.so.
PyTorch: 2.1.0+cu121 (7bcf7da3a268b435777fe87c7794c382f444e86d) ( in...
```
For debugging, for dumping, but also as an alternative to `torch.compile` support for the direct PyTorch backend (#1491), it could be useful to have another backend which outputs PyTorch code, instead...
For the first few steps, it could run without tracing/scripting, but then it could enable it and from then on use the Torch graph directly (very similar to TF computation...
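A rough sketch of that idea using plain `torch.jit.trace`; the wrapper class, the step threshold, and the single-tensor interface are placeholders, and the actual RF/graph handling would be more involved:

```python
import torch


class EagerThenTraced:
    """Run the module eagerly for the first few steps, then switch to a traced graph."""

    def __init__(self, module: torch.nn.Module, trace_after_step: int = 3):
        self.module = module
        self.trace_after_step = trace_after_step
        self.step = 0
        self.traced = None

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        if self.traced is not None:
            # From now on, reuse the recorded Torch graph directly,
            # similar to reusing a TF computation graph.
            return self.traced(x)
        out = self.module(x)  # eager execution for the first steps
        self.step += 1
        if self.step >= self.trace_after_step:
            self.traced = torch.jit.trace(self.module, (x,))
        return out
```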
Here I want to collect some things to be done to speed up eager-mode execution. Most of it did not really matter in graph-mode execution when those extra things are...
I'm not really sure whether that is possible, because we have our own `Tensor` class which wraps around `torch.Tensor`, and similarly all the PyTorch functions are wrapped inside RF....
It trains fine for a while, and then often I get a CPU OOM, which looks like:
```
[2024-01-04 11:41:05,662] INFO: Start Job: Job Task: run
...
RETURNN starting up,...
```