`DotLayer`: maybe use 1D conv CuDNN kernel
See LinearLayer for the case of batch-feature-major input: there it uses tf.nn.conv1d to avoid the transpose. We could use the same trick in DotLayer for those cases where it would let us avoid a transpose.

This probably requires some good heuristics for when it actually makes sense, and maybe also some real benchmarks. It might all turn out not to be so relevant after all.
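For reference, a minimal sketch of the idea (not RETURNN's actual LinearLayer/DotLayer code; `linear_via_conv1d` is a hypothetical helper): a width-1 convolution in `"NCW"` data format computes the same thing as a per-frame matmul, but accepts the batch-feature-major layout directly, so the CuDNN kernel can run without any transpose.

```python
import tensorflow as tf

def linear_via_conv1d(x_bft, weights):
    """Linear transform on batch-feature-major input, without transposing.

    :param tf.Tensor x_bft: shape [B, n_in, T] (batch-feature-major)
    :param tf.Tensor weights: shape [n_in, n_out]
    :return: tensor of shape [B, n_out, T]
    """
    # A width-1 kernel [1, n_in, n_out] makes conv1d equivalent to applying
    # the weight matrix at every time frame. data_format="NCW" matches the
    # batch-feature-major layout (and is CuDNN-native on GPU; CPU kernels
    # may only support "NWC"), so no transpose is needed.
    filters = tf.expand_dims(weights, axis=0)  # [1, n_in, n_out]
    return tf.nn.conv1d(
        x_bft, filters=filters, stride=1, padding="VALID", data_format="NCW")

# Equivalent reference computation with explicit transposes:
#   y = tf.transpose(tf.matmul(tf.transpose(x_bft, [0, 2, 1]), weights),
#                    [0, 2, 1])
```

Whether this is actually faster than transpose + matmul presumably depends on the shapes involved, which is where the heuristics and benchmarks mentioned above would come in.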