Albert Zeyer

300 issue results for Albert Zeyer

For the contrastive loss implementation (#918), we flatten the masked encoder frames via `FlattenBatchLayer` and end up with a `B&Packed{'input_masked_frames:masked:time'}` batch dim. For all those frames, we want to create a...
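
As a rough illustration, a minimal sketch in RETURNN net-dict style (the layer names here are hypothetical; `"flatten_batch"` is the real class name of `FlattenBatchLayer`):

```python
# Minimal sketch (hypothetical layer names; "flatten_batch" is FlattenBatchLayer).
network = {
    # "masked_frames": [B, T_masked, F], the masked encoder frames (defined elsewhere)
    "flat_frames": {
        "class": "flatten_batch", "from": "masked_frames", "axis": "T",
        # output: batch and time merged into one packed batch dim, i.e. [B&Packed{...}, F]
    },
}
```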

The order of axes should never matter. But when a single dim tag occurs multiple times in a tensor (`Data`), it does matter, e.g. for operations like `SoftmaxOverSpatialLayer` on...
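
To illustrate the ambiguity, a hypothetical sketch: self-attention energies where both spatial axes carry the same time dim tag.

```python
# Hypothetical sketch: the same time dim tag occurs twice in the energy tensor.
network = {
    # "energy": [B, T, T] -- both spatial axes have the same dim tag
    "att_weights": {
        "class": "softmax_over_spatial", "from": "energy",
        # Ambiguous: which of the two T axes should be normalized?
        # With unique dim tags, selecting the axis by tag would be enough;
        # here, the order of axes suddenly matters.
    },
}
```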

```
% python3 tests/test_TFEngine.py test_engine_train
Installed libSegFault.so.
TF version: 1.14.0
2020-06-09 14:03:19.232834: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX...
```

A `Data` object has `time_dim_axis` and `feature_dim_axis`. There are many automatic rules for how these are defined and set. But these rules are somewhat arbitrary and not always straightforward. Many...
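
A minimal sketch of the default rules (simplified; the exact behavior depends on the RETURNN version):

```python
# Minimal sketch of the defaults (simplified).
from returnn.tf.util.data import Data  # in older versions: from TFUtil import Data

x = Data(name="x", shape=(None, 40))  # shape without batch dim -> [B, T, F]
assert x.batch_dim_axis == 0    # batch axis first by default
assert x.time_dim_axis == 1     # the first spatial (None) axis
assert x.feature_dim_axis == 2  # the last axis by default
```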

Recent RETURNN:

> RETURNN starting up, version 20200706.123141--git-1bbf93a4, date/time 2020-07-12-09-57-02 (UTC+0200), pid 1594, cwd /work/asr4/zeyer/setups-data/switchboard/2020-06-09--e2e-multi-gpu/data-train/base2.conv2l.specaug4a.wdrop03.adrop01.l2a_1e_4.ctc.devtrain.lrwa.lrt_0005.mgpu4.htd100, Python /work/tools/asr/python/3.7.1_tf_1.14-generic+cuda10.1/bin/python3

Horovod via SGE `-pe mpi 4` and then `mpirun`:

```
cluster-cn-275-pid1594: use_horovod, CUDA_VISIBLE_DEVICES:...
```

See `LinearLayer` for the case of batch-feature-major input. In that case, it uses `tf.nn.conv1d` to avoid the transpose. We could use the same in `DotLayer` for such cases where we...
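
A minimal sketch of the trick described (plain TF, not the actual `LinearLayer` code): for batch-feature-major input `[B, F_in, T]`, a kernel-size-1 `conv1d` in `"NCW"` format acts as a linear transform without an explicit transpose.

```python
import tensorflow as tf

# Sketch: linear transform on batch-feature-major input [B, F_in, T]
# via kernel-size-1 conv1d, avoiding a transpose to [B, T, F_in].
x = tf.random.normal([8, 40, 100])   # [B, F_in, T], batch-feature-major
w = tf.random.normal([1, 40, 64])    # [kernel_size=1, F_in, F_out]
y = tf.nn.conv1d(x, filters=w, stride=1, padding="VALID", data_format="NCW")
# y has shape [B, F_out, T] == [8, 64, 100], still batch-feature-major
```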

good first issue

It's fine if we restrict the usage to inside the loop only. But in this case, it could still be optimized out, and then it must operate on the...

The functions `get_rec_initial_extra_outputs_shape_invariants` and `get_rec_initial_extra_outputs`: currently they use `TensorShape`, with the implicit assumption that the data is batch-major.
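
A minimal sketch of the current pattern (simplified signatures, hypothetical state values):

```python
import tensorflow as tf

# Sketch: initial extra outputs with the batch axis implicitly first ([B, dim]).
def get_rec_initial_extra_outputs(batch_dim, **kwargs):
    return {"state": tf.zeros([batch_dim, 64])}  # batch-major assumed

def get_rec_initial_extra_outputs_shape_invariants(**kwargs):
    return {"state": tf.TensorShape([None, 64])}  # plain TensorShape, no dim tags
```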

Multiple things (sketched below):

- `Loss` becomes a subclass of `LayerBase`.
- Loss instances will be treated as normal layers, and the name logic for moving them out of the rec loop etc...
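
A hypothetical skeleton of the proposed class relationship (class names from the issue; the bodies here are simplified placeholders):

```python
# Hypothetical skeleton of the proposed refactoring.
class LayerBase:
    def __init__(self, name, network, **kwargs):
        self.name = name
        self.network = network

class Loss(LayerBase):
    """Proposed: Loss is a layer, so loss instances go through the normal
    layer construction and naming logic (incl. moving out of the rec loop)."""
```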

cleanup/refactor
difficulty: hard

The idea was simple: based on the inputs/kwargs, determine the output `Data` type (without actually computing the tensor). This was mostly about `dtype`, `shape` and `dim`. Over time, `Data` was...
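
A minimal sketch of that pattern (the classmethod name follows RETURNN's convention; this simplified body is hypothetical):

```python
from returnn.tf.util.data import Data

class MyLayer:
    @classmethod
    def get_out_data_from_opts(cls, name, n_out, **kwargs):
        # Derive the output Data template from the options alone,
        # without computing any tensor: mostly dtype, shape and dim.
        return Data(name="%s_output" % name, shape=(None, n_out), dtype="float32")
```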

cleanup/refactor